amenditman Posted January 14, 2024
I am trying to rescue an XML file that is over 5.4 GB in size by trimming lines according to the application developer's advice. All my attempts to open and work with the file in a text editor end in frustration because the editor freezes for hours. I have been able to get the text of the beginning deletion line and the end deletion line, over several days of just waiting for the editor to display my selections. I think sed would be able to manage this much faster and am trying to write the script. I am failing to see any result, as the output file ends up the same as the input. Here is what I have; I need help to get it to work as desired.
sed '/<tiles viewLevels="PROVINCE" tilesWide="25920" tilesHigh="19440">/,/<tiles>/d' <OceanWorld-Copy.wxx> OceanWorld-Trimmed.wxx
Any suggestions where I am wrong?
crp Posted January 14, 2024
Back in the day, when I had issues like this, I used split, did the work on each segment it created, and then recreated the file by joining the edited segments.
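A minimal sketch of the split-and-rejoin approach described above, run on a tiny made-up file rather than the real 5.4 GB one. The file names, the `part_` prefix, and the two-line chunk size are assumptions chosen for illustration only.

```shell
# Hypothetical small stand-in for the large file.
workdir=$(mktemp -d)
printf 'line1\nline2\nline3\nline4\nline5\n' > "$workdir/sample.txt"

# Split into pieces of 2 lines each: part_aa, part_ab, part_ac.
split -l 2 "$workdir/sample.txt" "$workdir/part_"
pieces=$(ls "$workdir"/part_* | wc -l | tr -d ' ')

# (Each piece would be edited here with whatever tool can handle its size.)

# Rejoin the pieces; the shell expands part_* in sorted name order.
cat "$workdir"/part_* > "$workdir/rejoined.txt"

# cmp -s exits 0 when the two files are byte-identical.
cmp -s "$workdir/sample.txt" "$workdir/rejoined.txt" && status="identical"
echo "pieces: $pieces, round trip: $status"
```

One thing to watch with this approach: a block that has to be deleted can straddle two pieces, so the start and end lines may land in different segments.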
V.T. Eric Layton Posted January 14, 2024
Way too techie for this Luddite. I've never sed nothing in my entire life.
amenditman Posted January 14, 2024 (Author)
Well, that sucks. I've come up with a script which should delete the offending lines and tried several variations of the syntax. No luck. It just reads through the entire file without making the deletions.
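A sed range such as `/START/,/END/d` does nothing at all if the START pattern never matches a line, which would be consistent with an output file identical to the input. One hedged way to check this without opening the file in an editor is to ask grep for the line numbers of each pattern. The sample below is invented (including its attribute values and closing tag) and says nothing about what the real .wxx file contains.

```shell
# Hypothetical miniature file; the real layout may differ.
workdir=$(mktemp -d)
sample="$workdir/sample.wxx"
cat > "$sample" <<'EOF'
<map>
<tiles viewLevels="PROVINCE" tilesWide="3" tilesHigh="2">
tile data 1
tile data 2
</tiles>
<labels>
</labels>
</map>
EOF

# -F treats the pattern as a fixed string (no regex), -n prefixes the line number.
start_hits=$(grep -n -F '<tiles viewLevels="PROVINCE"' "$sample")
end_hits=$(grep -n -F '<tiles>' "$sample")

# An empty result means the pattern, exactly as typed, is on no line of the file.
echo "start: ${start_hits:-no match}"
echo "end:   ${end_hits:-no match}"
```

Running the same two grep commands against the real file, with the exact strings used in the sed script, would show whether each pattern matches any line and on which line number.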
sunrat Posted January 15, 2024
I'm sure you searched already, but I was curious and found this: https://techstop.github.io/delete-lines-strings-between-two-patterns-sed/
The applicable bit would seem to be:
Delete all the lines between PATTERN-2 and PATTERN-3:
sed -i '/PATTERN-2/,/PATTERN-3/{//!d}' file.txt
And hopefully you have a backup copy of the file for justin.
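To make the difference between the two forms concrete, here is a small sketch on made-up data, assuming GNU sed. The START and END markers are placeholders, not strings from the real file. Inside the braces, the empty pattern `//` reuses the most recently matched regular expression, so `//!d` spares the two boundary lines and deletes only what lies between them.

```shell
# Hypothetical six-line sample with placeholder markers.
workdir=$(mktemp -d)
sample="$workdir/sample.txt"
printf 'keep1\nSTART\ninner1\ninner2\nEND\nkeep2\n' > "$sample"

# Plain range delete: the START and END lines are removed along with the rest.
inclusive=$(sed '/START/,/END/d' "$sample")

# {//!d}: only the lines strictly between the two patterns are removed.
exclusive=$(sed '/START/,/END/{//!d}' "$sample")

echo "--- inclusive ---"; echo "$inclusive"
echo "--- exclusive ---"; echo "$exclusive"
```

Neither call above uses `-i`, so the sample file itself is left untouched; with `-i`, GNU sed rewrites the file in place, which is why the backup matters.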
crp Posted January 17, 2024
On 1/13/2024 at 5:17 PM, amenditman said: '/<tiles viewLevels="PROVINCE" tilesWide="25920" tilesHigh="19440">/,/<tiles>/d' <OceanWorld-Copy.wxx> OceanWorld-Trimmed.wxx
Is an escape character not needed before the > and < ?
amenditman Posted January 17, 2024 (Author)
On 1/14/2024 at 10:43 PM, sunrat said: I'm sure you searched already, but I was curious and found this: https://techstop.github.io/delete-lines-strings-between-two-patterns-sed/ The applicable bit would seem to be: Delete all the lines between PATTERN-2 and PATTERN-3: sed -i '/PATTERN-2/,/PATTERN-3/{//!d}' file.txt And hopefully you have a backup copy of the file for justin.
I will modify my one-liner as you suggest and try it. Nothing to lose. Yes, justin is on the case.
amenditman Posted January 17, 2024 (Author)
9 hours ago, crp said: Is an escape character not needed before the > and < ?
In sed, as I understand it (and that ain't too well), those are not escape characters, but are framing the patterns to be searched.
crp Posted January 18, 2024
6 hours ago, amenditman said: In sed, as I understand it (and that ain't too well), those are not escape characters, but are framing the patterns to be searched.
That is not what I asked. I asked whether escape characters are needed for those characters, which are also input/output redirection operators. This might depend on whether the sed one is using is POSIX. And maybe on the command shell?
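The escaping question can be tried out on a toy input. The sketch below assumes a POSIX-style shell and GNU sed; the file names are made up. Inside single quotes the shell passes `<` and `>` through untouched, and in sed's basic regular expressions they are ordinary characters. Outside the quotes, the same characters are redirections.

```shell
workdir=$(mktemp -d)
printf '<tiles>\nplain\n' > "$workdir/in.txt"

# Unescaped < and > inside single quotes match literal angle brackets.
matched=$(sed -n '/<tiles>/p' "$workdir/in.txt")
echo "literal match: $matched"

# Outside the quotes, <file and >file are shell redirections, so this reads
# in.txt on stdin and writes the result to out.txt (same shape as the
# one-liner in the first post).
sed '/<tiles>/d' <"$workdir/in.txt"> "$workdir/out.txt"
remaining=$(cat "$workdir/out.txt")
echo "after delete: $remaining"

# In GNU sed, \< and \> are word-boundary anchors, so adding backslashes
# changes what the pattern means instead of protecting the brackets.
sed -n '/\<tiles\>/p' "$workdir/in.txt"
```

So with the pattern in single quotes, adding backslashes before `<` and `>` would not be needed and, at least under GNU sed, would change the match.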
crp Posted January 18, 2024
On 1/13/2024 at 5:17 PM, amenditman said: I am trying to rescue an XML file that is over 5.4 GB in size by trimming lines according to the application developer's advice. All my attempts to open and work with the file in a text editor end in frustration because the editor freezes for hours. I have been able to get the text of the beginning deletion line and the end deletion line, over several days of just waiting for the editor to display my selections. I think sed would be able to manage this much faster and am trying to write the script. I am failing to see any result, as the output file ends up the same as the input. Here is what I have; I need help to get it to work as desired. sed '/<tiles viewLevels="PROVINCE" tilesWide="25920" tilesHigh="19440">/,/<tiles>/d' <OceanWorld-Copy.wxx> OceanWorld-Trimmed.wxx Any suggestions where I am wrong?
Did you see this (long) thread yet? https://stackoverflow.com/questions/407523/escape-a-string-for-a-sed-replace-pattern