How to clean text

These are some ways I have cleaned and formatted text when coding.
Remove “curly” or ‘smart’ quotes
When you copy and paste a list of items from a text document or PDF file into your code,
it may contain curly/smart quotes. Software like Microsoft Word and PowerPoint can use
smart quotes “ ” ‘ ’, which is how these quotes end up in pasted text. Most programming
languages use straight quotes for strings, like "hi". I am not aware of a popular
programming language that uses curly or smart quotes for strings. This is why it's
necessary to convert curly quotes into straight quotes when writing code.
Example:
“a”, “b”, “c”, “d”, “e”
num_list = [“a”, “b”, “c”, “d”, “e”]
If I tried to make this list in Python and run my code, I would get this error:
SyntaxError: invalid character '“' (U+201C)
You can clean this text by using the "special character fixes" tool on cleanertext.com
and applying it to your text. You simply check off the boxes, and the text is then
shown with straight quotes.
num_list = ["a", "b", "c", "d", "e"]
Python parses this list without complaint.
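If you would rather do the same fix in code, here is a minimal Python sketch. The function name straighten_quotes and the table name QUOTE_MAP are my own for illustration, and this is not how the cleanertext.com tool is implemented. It only handles the four quote characters shown above.

```python
# Map each curly quote to its straight equivalent.
# QUOTE_MAP and straighten_quotes are illustrative names, not part of any library.
QUOTE_MAP = str.maketrans({
    "\u201c": '"',  # left double quote
    "\u201d": '"',  # right double quote
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
})


def straighten_quotes(text: str) -> str:
    """Return text with curly quotes replaced by straight quotes."""
    return text.translate(QUOTE_MAP)


# The pasted list from the example, written with escapes so this file stays valid.
pasted = "\u201ca\u201d, \u201cb\u201d, \u201cc\u201d, \u201cd\u201d, \u201ce\u201d"
print(straighten_quotes(pasted))
```

The result can then be pasted between the square brackets of a list.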
Remove duplicate spaces
You may encounter text that has extra spaces or unnecessary line breaks, making it
difficult to read.

There aren't many out-of-the-box tools for the average person to clean text with extra
spaces. One way to remove extra spaces is to use a regular expression like \s+ to find
runs of spaces, tabs, line breaks, and other whitespace characters, then replace each
run with a single space character. Some text editors support regular expressions, but
for most people regexes can be confusing or difficult to understand.
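For anyone comfortable with a little code, the \s+ approach can be sketched in Python with the standard re module. The function name collapse_whitespace is just something I made up for this example. Note that it also trims leading and trailing whitespace, which is an extra step I am assuming you want, and that it turns line breaks into spaces too.

```python
import re


def collapse_whitespace(text: str) -> str:
    """Replace every run of whitespace with one space and trim the ends."""
    # \s+ matches one or more spaces, tabs, line breaks, or other whitespace.
    return re.sub(r"\s+", " ", text).strip()


messy = "This   text\thas \n\n extra    spaces.  "
print(collapse_whitespace(messy))
```

If you need to keep line breaks, a pattern like [ \t]+ that matches only spaces and tabs may be a better fit.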
There is an easy-to-use space cleaner tool on cleanertext.com where you simply check off
"remove duplicate spaces", press clean text, and voila, your text no longer has
unnecessary spaces.
Fix math symbols
If you are copying a basic math formula from a text document or PDF file, it may use
special characters that aren't useful when writing code.

For example, the symbol × is what most people know as the multiplication symbol, but
programmers use * instead when doing math. For division, / is used instead of ÷.
I have personally run into this situation many times when copying formulas to use in my
code and then needing to change each character to the correct math operator.
Under the "special character fixes" section on cleanertext.com, check off
" × → * " and " ÷ → / ", then press clean and your text will be cleaned! You can also
use a text editor to clean this text, but that requires copying and pasting the
multiplication and division symbols and then specifying which characters you want to
find and replace. You can also delete each occurrence of these symbols by hand and
replace it with the corresponding math operator.
With my tool you simply check off which characters you want to change and then apply the changes.
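The same two replacements can also be done in a few lines of Python. This is only a rough sketch with names I picked for the example (MATH_MAP, fix_math_symbols), and it covers just × and ÷, not other math symbols.

```python
# Map the typographic math symbols to the operators used in code.
# MATH_MAP and fix_math_symbols are illustrative names, not part of any library.
MATH_MAP = str.maketrans({
    "\u00d7": "*",  # multiplication sign
    "\u00f7": "/",  # division sign
})


def fix_math_symbols(text: str) -> str:
    """Return text with the multiplication and division signs replaced by * and /."""
    return text.translate(MATH_MAP)


formula = "area = width \u00d7 height \u00f7 2"
print(fix_math_symbols(formula))
```

Keep in mind that / means slightly different things across languages (in Python 3 it always produces a float), so it is worth checking the cleaned formula still computes what you expect.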