Find and replace help, to remove certain things from text - text

I have a file containing 18k lines of text, made up of links and random ID codes, which looks like this:
"
http://arduino.cc/en/Main/ArduinoBoardNano
SC09661
http://arduino.cc/en/Main/ArduinoBoardUno
http://www.farnell.com/datasheets/1639172.pdf
SC09670
http://arduino.cc/en/Main/ArduinoBoardUno
SC09665
http://arduino.cc/en/Main/ArduinoEthernetShield
SC09662
http://arduino.cc/en/Main/ArduinoXbeeShield
CS23020
http://bcove.me/zypzpy2q
SC09147
http://cache.national.com/ds/LM/LM134.pdf
SC08546
http://cache.national.com/ds/LM/LM2574.pdf
SC08540
http://cache.national.com/ds/LM/LM2576.pdf
"
I need to remove all the ID codes (SC08540, SC09662, ...) from this text, along with the links that do not end with .pdf. I know it's possible with Notepad++ and other programs using the Replace function, but I don't know exactly how to do it. Could I get some help with this?

I have not found a way to do this in one go with Notepad++ but this should work:
Open the replace box (Search --> Replace...) and select Regular expression
Search for ^(?!.*\.pdf$).*$ (this matches every line that does not end with .pdf)
Make sure Replace with is empty
Replace All
Now you have a file with empty lines and the links you want. There are at least two ways to get rid of the empty lines:
Method 1: TextFX plugin
Select all text
TextFX --> TextFX Edit --> Delete blank lines
Method 2: Replace
Make sure the cursor is at the beginning of the document
Open the replace box (Search --> Replace...) and select Extended
Search for \n\r
Make sure Replace with is empty
Replace All
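
If a script is an option, the same filtering can be done outside Notepad++. The following is only a minimal Python sketch: the function name keep_pdf_links is made up for illustration, and treating ".PDF" the same as ".pdf" is an assumption rather than something the question asks for.

```python
def keep_pdf_links(lines):
    """Return only the lines that end with ".pdf" (case-insensitive, an assumption)."""
    kept = []
    for line in lines:
        text = line.strip()  # drop the line ending and surrounding whitespace
        if text.lower().endswith(".pdf"):
            kept.append(text)
    return kept


# A few lines taken from the sample in the question.
sample = """http://arduino.cc/en/Main/ArduinoBoardNano
SC09661
http://www.farnell.com/datasheets/1639172.pdf
SC09670
http://cache.national.com/ds/LM/LM134.pdf
""".splitlines()

for link in keep_pdf_links(sample):
    print(link)
```

Because ID codes and empty lines never end with .pdf, they are dropped in the same pass, so no separate blank-line cleanup is needed.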

Related

How to remove invisible line break character

I have a large data set in Excel, and some cells contain HTML code. These cells have line breaks in them. I tried to replace the line breaks (Alt+010, \n), but Excel said there is no such character.
When I copy a cell to Notepad, there is no line break.
When I copy from Notepad to the phpMyAdmin SQL area or TextPad, I see the line breaks again.
There are Notepad, TextPad and phpMyAdmin SQL area screenshots below. How can I remove these invisible line breaks?
This could be a problem with Carriage Return + Line Feed. When you press Alt+Enter in Excel, it only inserts a Line Feed. But if you somehow get both Carriage Return + Line Feed in a cell, that could lead to additional problems. See this page for solutions:
https://www.ablebits.com/office-addins-blog/2013/12/03/remove-carriage-returns-excel/
Did you try to remove any unnecessary tabs within the code? Also check some trivial things, e.g. the maximum string length in your MySQL database or your editor's miscellaneous settings.
EDIT: Oh, I forgot. It may also be caused by your language settings; check the database's default regional encoding preset and whether Turkish is currently supported.
Line breaks - do you mean the line breaks you can enter in Excel with ALT+ENTER?
Then you could use the Search / Replace option in Excel without needing to copy your content to another tool:
Open it and, in the Search field, press CTRL+J (a dot will be displayed in the search field).
In Replace, enter whatever you want (nothing, a space, a semicolon, ...).
Select Replace all.
EDIT:
I've tested it by copying HTML from TextPad into one cell via the clipboard. In that case the method I described does not work.
But there is another solution: open the Replace command and, for the search string, press and hold the ALT key, type the three digits 0 1 0 on the numeric keypad (on the right side of a "standard" keyboard), and then release the ALT key (a dot will be displayed in the search field). Choose whatever replacement string you want and choose Replace all.
The =CLEAN() function helped me. Find/replace with CTRL+J worked for replacing, but it did not fully delete all the invisible characters in the string, so the cell was still misbehaving with Text to Columns. The =CLEAN() function finally removed all the invisible characters that were left.
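
If the data ends up being processed outside Excel anyway (for example before it is pasted into phpMyAdmin), the cleanup can be scripted. The sketch below is only an illustration in Python: it removes the control characters with codes 0 through 31, which is the range Excel's CLEAN is documented to strip, and which includes both line feed (10) and carriage return (13). The function name strip_control_chars and the sample cell text are made up.

```python
import re

# Control characters 0-31, including line feed (10) and carriage return (13).
CONTROL_CHARS = re.compile(r"[\x00-\x1f]")


def strip_control_chars(value, replacement=""):
    """Replace every control character in value with replacement."""
    return CONTROL_CHARS.sub(replacement, value)


# Hypothetical cell content with a CR+LF pair and a bare LF.
cell = "<p>first line</p>\r\n<p>second line</p>\n"
print(repr(strip_control_chars(cell)))
```

Passing replacement=" " instead keeps words on adjacent lines from running together, at the cost of leaving two spaces wherever a CR+LF pair was.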

Adobe Brackets: how can I find/replace multiple lines?

Is there a way to find multiple lines within a project, e.g.
<li>ABC</li>
<li>LKJ</li>
and replace it with: e.g.
<li>...</li>
<li>ABC</li>
<li>...</li>
<li>...</li>
<li>LKJ</li>
?
Thank you!
Brackets can search with regular expressions. There's a little button labelled [.*] next to the search field; click it to activate regex mode. To do what you want, your search field would be something like:
(<li>ABC</li>)(<li>LKJ</li>)
And your replace would be:
<li>...</li>$1<li>...</li><li>...</li>$2
Find > Replace In Files
Choose files
Set search and replace text
Review changes
Replace
http://blog.brackets.io/2014/06/27/brackets-0-41-release-replace-across-files/
Brackets has more powerful shortcuts to help replace any matched string.
Ctrl + H can be used to Find and Replace Strings without Regex
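
The capture-group idea from the regex answer can also be tried outside the editor. This is a rough Python sketch, not Brackets behaviour: the \s* between the two groups is an added assumption so that the line break and any indentation between the two <li> lines are matched, and the explicit line breaks in the replacement are likewise an assumption about the desired layout.

```python
import re

# Two adjacent <li> lines; \s* absorbs the line break and indentation between them.
PATTERN = re.compile(r"(<li>ABC</li>)\s*(<li>LKJ</li>)")

# \1 and \2 play the role of $1 and $2 in the editor's replace field.
REPLACEMENT = "<li>...</li>\n\\1\n<li>...</li>\n<li>...</li>\n\\2"


def expand_list(html):
    """Insert placeholder <li> items around the two captured items."""
    return PATTERN.sub(REPLACEMENT, html)


before = "<li>ABC</li>\n<li>LKJ</li>"
print(expand_list(before))
```

Text that does not contain the two items next to each other is returned unchanged, so the function can be applied to every file in a project without checking first.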

Remove lines that don't contain a string with Sublime

I have 6 huge text files, and I need to filter them by deleting all the lines that don't contain the string: 53=S.
For 5 of them, I managed to filter the files with Notepad++ as follows:
Find --> Mark --> Bookmark Lines --> Mark All --> Search --> Bookmarks --> Remove Unbookmarked Lines
However, the application crashed on one specific file each time I tried it. I tried it on two PCs with the same result.
Does anyone know how I can remove the irrelevant lines with Sublime or any other tool?
You could try a regular expression replace in Notepad++.
In Notepad++, press Ctrl+H or bring up the Search > Replace window.
In the 'Find what' text box enter ^(?m)^(?:(?!53=S).)*$
and leave the 'Replace with' text box empty.
Make sure the search mode is set to 'Regular expression' and then hit 'Replace All'.
This should remove any line that doesn't contain the string 53=S.
There is a Notepad++ plugin called LineFilter (not LineFilter2), which provides a menu with entries like
delete all lines containing the selection
delete all lines not containing the selection
It opens a new tab with the result. That worked on large files. I liked it a lot.
The plugin is available from Notepad++ Plugin Central.
If you have grep available, then grep should do the trick, too.
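
For files that are too large for an editor, a short script that streams the file line by line avoids loading it into memory. The following is a minimal Python sketch under a few assumptions: the function name filter_lines and the file names are hypothetical, and the file is read in binary mode so that its encoding and line endings pass through untouched.

```python
import os
import tempfile


def filter_lines(src_path, dst_path, needle=b"53=S"):
    """Copy only the lines containing needle; return how many were kept."""
    kept = 0
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        for line in src:  # reads one line at a time, so memory use stays small
            if needle in line:
                dst.write(line)
                kept += 1
    return kept


if __name__ == "__main__":
    # Tiny demonstration on a throwaway file with made-up content.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "input.txt")
        dst = os.path.join(tmp, "filtered.txt")
        with open(src, "wb") as f:
            f.write(b"a;53=S;b\na;53=B;b\n53=S;c\n")
        print(filter_lines(src, dst), "lines kept")
```

With grep, the equivalent would be along the lines of grep '53=S' input.txt > filtered.txt.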

How to remove line endings on multiple lines in Visual Studio 2012

In the Visual Studio 2012 editor, I don't need to remove entire or multiple blank lines, which is what all of the other stuff I could find on S/O is concerned with. I want to select multiple lines of aspx markup (usually from 2 to 10 or so) and remove the line endings so that everything that was in the selected lines ends up on one line. A small example of what I want to do is:
BEFORE source code:
[dx:GridViewTextColumn ID="Inactive"
Width="50"]
[/dx:GridViewTextColumn]
AFTER:
[dx:GridViewTextColumn ID="Inactive" Width="50"][/dx:GridViewTextColumn]
(replace the "[" chars above with "<" chars, and "]" with ">" chars, I entered it that way just to get it to display somewhat properly here)
It seems like this should be pretty simple, but I have tried various search/replace and regex values that are talked about in the many articles that talk about removing entire blank lines, but can't get anything to work. A little help? :)
** 2014-02-06 at 2306 hrs update:
Still trying, but this gets me really close:
In Visual Studio 2012 ide.
Working in an aspx file with its xml markup.
Do Ctrl-H to do a string search/replace on a selected stretch of xml (from start to finish of a particular well-formed tag, which may or may not contain subtags, but each tag is on a separate line).
Specify the following in the from textbox:
\s{2,}
Specify one blank space in the to textbox.
For the first example I gave, the result would be:
[dx:GridViewTextColumn ID="Inactive" Width="50"] [/dx:GridViewTextColumn]
Note the single blank space between the separate tags (after the first ']' character and before the second '[' character). If I could figure out how to not have that space in the result it would be perfect.
** 2014-02-06 at 2322 hrs update:
Oh duh. Otay, I think I got it, the from textbox value is the same, but for the to textbox value, instead of a single space, just have nothing. That seems to work for my specific use case (selecting a certain amount of xml within aspx markup and making it all be on a single line with no extraneous blank spaces). Yay!
In the VS2012 ide, Ctrl-H dialog, specify regex, then in the from textbox put:
\s{2,}
and in the to textbox put nothing.
Execute on whatever amount of xml that you have selected (from the beginning tag less-than character to its matching ending tag greater-than character, along with any/all subtags in between, over any number of lines of source code).
What you have selected will be condensed to one line of source code without any extraneous blank spaces present.
This seems to work for the xml that is in .NET aspx markup (what I specifically needed); not guaranteed to work anywhere else.
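
For comparison, here is a small Python sketch of the same cleanup done in two passes. It is a variation, not the Visual Studio search itself: replacing \s{2,} with nothing also swallows the line break that separates two attributes split across lines (giving ID="Inactive"Width="50"), so this version first collapses every whitespace run to one space and then removes only the space left between adjacent tags. The function name join_markup_lines is made up.

```python
import re


def join_markup_lines(markup):
    """Collapse a multi-line block of markup onto one line."""
    # Pass 1: turn every run of whitespace (including line breaks) into one space.
    one_line = re.sub(r"\s+", " ", markup.strip())
    # Pass 2: drop the space left between a closing '>' and the next '<'.
    return re.sub(r">\s+<", "><", one_line)


before = (
    '<dx:GridViewTextColumn ID="Inactive"\n'
    '    Width="50">\n'
    "</dx:GridViewTextColumn>"
)
print(join_markup_lines(before))
```

Like the editor approach, this is only meant for markup where whitespace is not significant; runs of spaces inside attribute values or text content are collapsed as well.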

Remove non utf8 lines in text file

How do I remove only the non-UTF-8 keywords/lines in a text file?
e.g.
你好
相手123abc
this is only abc
I only want to remove lines that consist entirely of English words, not the lines with UTF-8 (non-English) words. So in this case only 'this is only abc' will be removed. Is it possible to do this in Notepad++, or do I need to write a script for it?
This is possible using the following steps:
Open Notepad++, open the Find dialog and select the last tab, 'Mark'. With the search mode set to Regular expression, enter the regex ^[a-zA-Z ]+$ (lines made up only of English letters and spaces), select 'Bookmark line', and click the button 'Mark All'.
From the drop down menu select; Search --> Bookmark --> Remove Bookmarked Lines
I would also recommend making sure Notepad++ is up to date. I tested this with version 6.3. Marking lines is something added quite recently.
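
If a script turns out to be easier, the check can be expressed directly as "does the line contain any non-ASCII character". This is a minimal Python sketch (Python 3.7 or later for str.isascii); the function name drop_ascii_only_lines is made up, and unlike the regex above it also drops ASCII-only lines that contain digits or punctuation, which is an assumption about what is wanted.

```python
def drop_ascii_only_lines(lines):
    """Keep only the lines that contain at least one non-ASCII character."""
    # str.isascii() is True for pure-ASCII text and for empty strings,
    # so blank lines are dropped along with English-only lines.
    return [line for line in lines if not line.isascii()]


sample = ["你好", "相手123abc", "this is only abc"]
for line in drop_ascii_only_lines(sample):
    print(line)
```

For a real file, the lines would come from reading it with encoding="utf-8" and the result would be written back out with the same encoding.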
