Problem
I'm using Log Parser 2.2 to read IIS logs and import them to SQL Server. Everything works fine except the cs-username field: if it contains special characters, those get messed up during database import. As this is SharePoint web application and it uses external trusted provider, I cannot control nor get rid of the special characters in user names before they are stored in IIS logs.
So, the IIS logs contain hits with user names such as:
#Software: Microsoft Internet Information Services 8.5
#Version: 1.0
#Date: 2016-10-01 07:27:28
#Fields: date time s-ip cs-method cs-uri-stem cs-uri-query s-port cs-username c-ip cs(User-Agent) cs(Referer) cs-host sc-status
2016-10-01 07:28:15 10.0.2.4 GET /_layouts/images/advadd.png - 443 0ǵ.t|claims+idp|12345678 111.222.223.224 Mozilla/5.0 https://my.site.com/page.aspx my.site.com 200
And after Log Parser imports those to DB, user name has changed to 0ǵ.t|claims+idp|12345678
Same applies if I test outputting it to Data Grid with
LogParser.exe "SELECT cs-username FROM f:\iislogs\dev3\*.log" -i:IISW3C -o:datagrid
--> I see 0ǵ.t|claims+idp|12345678 instead of 0ǵ.t|claims+idp|12345678.
Solution
As IIS Log files are stored using UTF-8, you need to add -iCodepage 65001 to the Log Parser command, like so:
logparser.exe file:AnalyticsQuery.sql -i:IISW3C -o:SQL -database:AnalyticsDB -server:sqlserver -iCodepage:65001
And of course remember to set your table column as nvarchar or other n-type.
both looks incorrect, as the usual format is
ReplyDelete0#.w| or 0#.i|, but nice trick ;)
I guess you can implement the trusted provider like you want, but that solution doesn't look standard.
the editor doesnt like < > apparently...
Deletew|username or i|upn was what i wrote. Those characters! :)
It is standard, check this out for details: http://www.wictorwilen.se/Post/How-Claims-encoding-works-in-SharePoint-2010.aspx
Delete