Dealing with spaces while scripting or in terminal is such a pain in the ass. The true dark path of horror is using spaces indeed.
I work on a Web app and we recently decided that we’re just not gonna support double quotes in free text fields because oh holy balls what a thing it is to try to deal with those in a way that doesn’t open you up to multiple encoding vulnerabilities.
It’s a way bigger pain in the ass than people think it is. I remember having to parse output from a tool for work that had tons of output in tabular format, mixed with normal sentence-like strings. JSON, YAML, or XML outputs weren’t available, so I had to do a nasty mess of grep, awk, cut, and head/tail to get what I wanted. My first attempt was literally counting the characters so I could cut out exactly what I needed, but as we all know, hardcoding values is a recipe for headaches later on.
Here’s a horror story from literally yesterday. We have been fighting a system for a client for weeks and it has been a nightmare. Our clients just told us that they outsourced some of their work to an Indian outfit but that outfit is unfamiliar with Linux and doesn’t know how to edit text files so they have been downloading the files to their Windows machines, editing them in Windows, then uploading the contaminated text files back into Linux. None of them, not our client nor the outfit they hired, understood why this was a problem. We have no idea what files are affected and we won’t know until they fail because they obviously did not keep track of what they touched.
EDIT: I’m being intentionally vague.
If this is about line endings, surely a simple shell or Python script could correct them?
There’s already a command for it:
https://linux.die.net/man/1/dos2unix
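If dos2unix isn't installed, the conversion itself is small enough to sketch in Python. This is only a rough sketch: `crlf_to_lf` and `fix_file` are made-up names, and it assumes you only point it at text files, since rewriting those bytes inside a binary would corrupt it.

```python
from pathlib import Path


def crlf_to_lf(data: bytes) -> bytes:
    """Replace every Windows CR LF pair with a single Unix LF."""
    return data.replace(b"\r\n", b"\n")


def fix_file(path: Path) -> bool:
    """Rewrite `path` in place if it contains CR LF; return True if it changed."""
    original = path.read_bytes()
    fixed = crlf_to_lf(original)
    if fixed == original:
        return False  # already clean, so leave the file untouched
    path.write_bytes(fixed)
    return True
```

Working on bytes rather than decoded text means the sketch doesn't have to guess each file's encoding. The return value of `fix_file` also gives you a record of which files were actually touched.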
Does Windows add an extra character at the end that gets converted to a new line on Linux? Because the other day I was copying a script and after pasting it an extra line was added after every single line, even the empty lines.
how it looked when I copied it:
what it turned into:
Windows uses CR LF (carriage return, line feed), whereas Unix just uses LF. For added fun, Macs use CR.
This used to be true, for sure, but I thought this changed with OS X (which is essentially PrettyBSD)?
You’re right. Notepad++ still lists Macs as using CR for their EOL conversion tool, so I didn’t realize.
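To make the CR LF vs LF difference concrete, here is a rough Python sketch that counts each kind of line ending in a chunk of bytes. `count_line_endings` is a made-up helper, not something from a library, and it needs Python 3.9+ for the type hints.

```python
def count_line_endings(data: bytes) -> dict[str, int]:
    """Count Windows (CR LF), Unix (LF) and bare CR line endings in `data`."""
    crlf = data.count(b"\r\n")
    return {
        "CRLF": crlf,
        "LF": data.count(b"\n") - crlf,  # LFs that are not part of a CR LF pair
        "CR": data.count(b"\r") - crlf,  # CRs that are not part of a CR LF pair
    }
```

A file saved on Windows should show up with a non-zero `"CRLF"` count. If a tool treats CR and LF as two separate line breaks, every CR LF turns into two breaks, which is one plausible explanation for the extra blank line after every line described above.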
The only reasonable response to this behavior is disproportionate violence
You can just grep for carriage returns at the end of a line:
grep -Prl '\r$' /path/to/whatever
It’ll list all your problematic files.
"\ " and [tab] and * are your friends. I’ve been using spaces in Unix filesystems since the early 90s with no issues. Also, using terminal fonts that•put•a•faint•dot•in•each•space•character helps.
Yeah, either put quotes around it '/like this/you can incorporate/spaces/into your paths' or /just\ escape/your\ spaces/like\ this
This is fine for the most basic of use cases, but once you start looping through file names or what have you, you have to start writing robust, correct bash, and nobody does that.
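The usual trouble in a loop is that an unquoted variable gets split on whitespace before the command ever sees it. This isn't bash, but here is a rough Python sketch of the alternative: skip the shell entirely and hand each name over as its own argument. `show_argv` is a made-up helper that just echoes back what a child process received, and the file names are hypothetical.

```python
import json
import subprocess
import sys
from pathlib import Path


def show_argv(paths: list[Path]) -> list[str]:
    """Run a child process with one argument per path and return what it received."""
    # The child prints its own argument list as JSON so we can inspect it.
    child = "import sys, json; print(json.dumps(sys.argv[1:]))"
    result = subprocess.run(
        [sys.executable, "-c", child, *map(str, paths)],  # a list, so no shell is involved
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Because the arguments are passed as a list, `show_argv([Path("my file.txt")])` comes back as a single `"my file.txt"` rather than two words, with no escaping needed.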
It gets real crazy when you’re sending remote commands, because you have to escape the escapes so that the remote keeps them and properly escapes the space:
ssh -t remote "mv /home/me/folder\\ with\\ spaces /home/me/downloads/"
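If the command is being built by a script anyway, one way to avoid counting backslashes by hand is to let `shlex.quote` from the Python standard library do the remote-side quoting. A rough sketch, assuming the remote login shell is POSIX-like; `remote_command` and `ssh_argv` are made-up names, and the host and paths are just the ones from the example above.

```python
import shlex


def remote_command(argv: list[str]) -> str:
    """Join argv into one string that a POSIX shell parses back into the same argv."""
    return " ".join(shlex.quote(arg) for arg in argv)


def ssh_argv(host: str, argv: list[str]) -> list[str]:
    """Build a local argument list for subprocess.run, so no local shell is involved."""
    # Only the remote shell parses the command, so one layer of quoting is enough.
    return ["ssh", host, remote_command(argv)]
```

For example, `remote_command(["mv", "/home/me/folder with spaces", "/home/me/downloads/"])` gives `mv '/home/me/folder with spaces' /home/me/downloads/`, and passing the result of `ssh_argv` to `subprocess.run` as a list means there is no second round of escaping on the local side.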
Does SSH require quoting commands?
It doesn’t for commands without spaces (e.g. reboot). You might be able to escape the spaces and not use quotes, I’m not sure.
Might be client-dependent; I’ve regularly run commands with spaces (e.g. ssh a@a.local ssh b@b.local) without a problem.
Yup, this is me with scp. Well, it would be if I didn’t just use asterisks to avoid that PITA.
Yeah, but at least with periods in the title, tab complete will just complete the file name all the way, while with a filename with spaces I have to escape the damn space with "\ " like you said. Why do more work when I don’t have to?
My shell seems to autocomplete filenames that have spaces with "\ " already.
Yeah I was gonna say this is something anyone in tech knows, spaces are a plague