Working on an input extractor issue with IIS logs using an "advanced" IIS login tool to collect more than the basic logs provide. It's adding double quotes and spaces to many of the fields and we are trying to us the extractor to correct this. This is the beginning of an example message:
2016-02-08 16:46:35.957 "SITE" "SOURCE" XX.XX.XX.XX GET /blah/etc/etc/file.ext - 80 - "XX.XX.XX.XX" "HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; yie11; rv:11.0) like Gecko"
We've already written an extractor to remove all the added quotes before running it through all the other extractors to populate the fields, etc., but we want to replace all spaces between the quotes with
+ before we do that to match the old logging style.
Can anyone point us in the right direction for this? The closest I've come so far is catching
" " between SITE and SOURCE and replacing that using something like
2016-02-08 16:46:35.957 "SITE+SOURCE" XX.XX.XX.XX GET /blah/etc/etc/file.ext - 80 - "XX.XX.XX.XX+HTTP/1.1+Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; yie11; rv:11.0) like Gecko"
I can't seem to only look for spaces between the quotes.
Any help would be greatly appreciated. Thanks.
Further Clarification. This portion of the string:
"Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; yie11; rv:11.0) like Gecko"
Everything else should remain the same as those are the only spaces inside of a quoted section of the string.
Is this even possible with regex?