Finding words in a file present in a different file - text

I have a file with a list of properties.
Name
Description
BogusProperty_the_first
The full file has some 200 properties.
I also have an XML file that references properties in the previous list, containing entries like:
<Item value="#Name#" length="32" description="Name" />
I want to remove from the first file all entries that are/are not present in the second file.
I do not need a perfect fit: it's OK if I treat some entries as being present in the second file when in fact they are not. It is sufficient to test that "Description" occurs somewhere in the second file; I don't need to test that value="#Description#" occurs in a tag at the appropriate place in the DOM.
It would be bad to treat entries in the first file as not being in the second file if in fact they were.
The solution does not need to be completely automated or a single button click, but I do not want to check each item in the first file separately.
I am using Notepad++, but would be open to using other tools if applicable.
The problem is small enough that writing a separate program to handle it, while straightforward, would not be worth it.

While writing the question I realised that Notepad++ can solve this by copy-pasting the second file into a copy of the first file.
The procedure I used was the following:
Write a separate line at the end of the first file with text that does not occur in either of the two files. In my case I used asdf1234.
Copy the contents of the second file into the first, after that line.
Search for the following regexp with ". matches newline" checked.
(?:\n|\r)([^\n\r]+)(?=(?:\n|\r).*asdf1234.*\1)
Replace with nothing.
???
Profit
To keep the entries that do occur in the second file use (?:\n|\r)([^\n\r]+)(?=(?:\n|\r))(?!.*asdf1234.*\1) to search.
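For anyone who would rather script it after all, the same loose test (does the property name occur anywhere in the second file?) can be sketched in a few lines of Python. This is only a sketch: the function name split_properties and the sample XML with a second Item entry are made up for illustration, and the check is a plain case-sensitive substring test, so it errs on the side of treating entries as present, as the question allows.

```python
def split_properties(property_lines, xml_text):
    """Partition property names by whether they occur anywhere in xml_text."""
    present, missing = [], []
    for line in property_lines:
        name = line.strip()
        if not name:
            continue  # ignore blank lines in the property list
        # Plain substring test: false positives are possible, false negatives are not.
        (present if name in xml_text else missing).append(name)
    return present, missing


# Illustrative data; the second <Item> entry is an assumption, not from the real file.
properties = ["Name", "Description", "BogusProperty_the_first"]
xml_text = (
    '<Item value="#Name#" length="32" description="Name" />\n'
    '<Item value="#Description#" length="64" description="Description" />\n'
)

present, missing = split_properties(properties, xml_text)
print(present)  # ['Name', 'Description']
print(missing)  # ['BogusProperty_the_first']
```

Writing either list back to a file gives the "keep" or "remove" variant of the result.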

Related

Edit Input CSV file (or copy of it) as each row processed

Long story short, after a crash course in Python/BeautifulSoup, I managed to create a script to take an input text file that contains a list of URLs (1 on each line), scrape the URL, and write output to a database. There are some cases where I want an error to exit the script (including some trapped errors as well as unexpected), but as the list of URLs to scrape is pretty large, it would be handy if I could edit the input text file (or create a copy and edit that) to remove each URL as it is successfully processed. The idea being that if the script exits (by trap or crash), I'd have a list of the URLs left to be processed. Is something like this possible? I can find code samples to edit the text file, but I'm getting stuck at how to take out the row just processed.
Finally came across the post here that achieves the answer, though I'm not positive it's the most efficient way as it's reading the entire file and saving each time, but that may be the best that can be done in Python. In my case, the file is in the 1200 lines range, so it easily fits into memory.
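A minimal sketch of that read-everything-and-save-each-time approach is below. It is not the code from the linked post: the names process_urls and handle_url are hypothetical, with handle_url standing in for the scrape-and-store step. The list is rewritten through a temporary file and os.replace so that a crash in the middle of a write should not leave a half-written list behind.

```python
import os


def process_urls(list_path, handle_url):
    """Call handle_url for each URL, rewriting list_path after every success."""
    with open(list_path, encoding="utf-8") as f:
        remaining = [line.strip() for line in f if line.strip()]

    tmp_path = list_path + ".tmp"
    while remaining:
        # If handle_url raises, the file on disk still lists this URL.
        handle_url(remaining[0])
        remaining.pop(0)
        with open(tmp_path, "w", encoding="utf-8") as f:
            f.writelines(url + "\n" for url in remaining)
        os.replace(tmp_path, list_path)  # swap the new list into place
```

Pointing list_path at a copy of the input file keeps the original list intact. For a file in the 1200 lines range the repeated rewrite costs very little.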

PowerBI: Importing data from mixed folder files

I have a question about importing different kinds of files from a folder into Power Query (Power BI). By different, I mean .xlsx and .txt files. There is actually just one text file, but it is important that it is part of the report. The Excel files are, and always will be, consistent, as shown in the first picture: only the date is dynamic and the contents are consistently structured, so I just have to put a file into the folder and hit refresh in Power Query and, like magic, it works fine. But I also have that .txt file, which holds completely different information that is still connected to the report (it contains a date/time field with additional information). My question is: what is a good approach to get all these files into one or more queries?
As you can see in the second picture (from the PQ editor), the .txt file is in the last position in the content part. I "isolated" it by right-clicking it and choosing "Add as a new query", and then I need to do the editing and so on. Is there another approach to solve this? One problem I discovered is that when I change the path of the file, all queries are refreshed except the one for the .txt file, even though I changed the path completely in the Advanced Editor. It simply gives an error. Does anyone have an idea how to deal with different files from one folder, assuming that you need all of the files in it?
If you do not want two folders, your approach of isolating the .txt file is appropriate. About the refresh issue: if you expanded the data by clicking Combine, Power Query will have created other queries and parameters ("Sample from....") and you must change the path in those queries too.

How to delete null line in file using sqlldr, ctl

Please let me know how to delete null (blank) lines using sqlldr and a ctl file.
I also want to know how to remove the last two lines of a file.
There are null lines at the tail, that is, in the last one or two lines.
Also, I cannot know the last line number in advance.
Waiting for a reply 😥
You need to either pre-process the file and remove blank lines before running sqlldr via a wrapper script, or, more commonly, just load all rows from the file into a staging table, then call a PL/SQL script from there to load into the main table.
Pre-processing alters the main file, so that is usually not a good idea unless of course you make an archive copy first.
Using a staging table is more common, as that way all rows from the file are available and you can select the rows you want, transforming, validating, etc. the data on the way into the main table.
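If you do go the pre-processing route, the wrapper step could look roughly like the Python sketch below. It writes a cleaned copy instead of altering the original, so the source file doubles as the archive copy. The function name strip_blank_lines and the UTF-8 encoding are assumptions; adjust the encoding to match your data file.

```python
def strip_blank_lines(src_path, dst_path, encoding="utf-8"):
    """Copy src_path to dst_path, skipping empty and whitespace-only lines.

    Returns the number of lines kept. The original file is left untouched.
    """
    kept = 0
    with open(src_path, encoding=encoding) as src, \
            open(dst_path, "w", encoding=encoding) as dst:
        for line in src:
            if line.strip():
                # Make sure the last kept line is newline-terminated too.
                dst.write(line if line.endswith("\n") else line + "\n")
                kept += 1
    return kept
```

Because blank lines are dropped wherever they occur, you do not need to know the last line number. sqlldr would then be run against the cleaned copy rather than the original file.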

How to search for files faster in Sublime Text 3

Right now I do ⌘t then scroll through autocomplete, or start typing the name (but half the time it doesn't find it).
Sublime doesn't find a file in many cases. For example, I typically have all my files called index.<ext> nested inside some folder. So I might have:
my/long/directory/structure/index.js
my/long/directory/structure2/index.js
my/long/directory/structure3/index.js
my/long/directory/structure.../index.js
my/long/directory/structuren/index.js
my/long/directory/index.js
my/long/directory2/index.js
my/long/directory.../index.js
my/long/directoryn/index.js
my/long/index.js
my/index.js
...
But in Sublime you have to search for an exact path. I can't search this:
my directory index
And get results for directory, directory2, directory..., directoryn. I just get empty results because there is no my/directory. Most of the time I can't remember the full folder path, so I end up navigating in the sidebar to find the file, which takes some time.
Wondering if there is a better/faster way of doing this. Basically searching for a file by snippets/keywords of the complete path. So m dir would return my/long/directory, etc.
The first thing to note is that you do not have to search for an exact path; anywhere that Sublime provides you a list of items to select from and a text entry, fuzzy matching is in play. In your example searching just for idx will narrow down the list to all items that have those characters in that order, even if they're not adjacent to each other.
The entries show you visually how they're matching up, and there's a fairly sophisticated system behind the scenes that decides which characters make the best matches (relative to some hidden scoring algorithm):
In addition to this you can use multiple space separated terms to filter down the list. Each term is applied to the list of items resulting from the prior term, so they don't need to be provided in the same order as they appear in the file names.
This helps with searches where you know generally the name of the file, and from there can further drill down on segments of the path or other terms that will help narrow things down:
Something to note here is that as seen in these images, the folder structure is my/long/directory/structure, but the names of the files as seen in the panel don't include the my/ at the start.
In cases where your project contains only one top level folder, that folder isn't presented in the names of the files. Presumably this is because it's common to every file and thus not going to be a useful filter. As such trying to use my in the search field will return no matches unless one of the files has an m and a y somewhere in their filenames.
This isn't the case if there are multiple top level folders; in that case Sublime will include the root folder in the names of the files presented because now it's required to be able to distinguish between files in the different folders:
In addition to this, note that for any given filter text you enter in a panel, Sublime remembers the full text of the item that you selected while that filter text was being used, and uses that in its scoring to prioritize the matches the next time you search in the same panel. The next time you search with the same term, Sublime will automatically pre-select the item that you picked last time under the theory that you probably want it again.
The search terms and their matches are saved in the session file and in your project's sublime-workspace files, so as you move from window to window and project to project you're essentially training Sublime how to find the files that you want.
My advice would be to try and flip your thinking a little bit. In my opinion the power of the fuzzy matching algorithm works best when you try to find files in a more organic way than trying to replicate the path entirely.
Instead, I would throw a few characters from the name of the file that I'm trying to find first, and then add another term that filters on some part of the path that will disambiguate things more; a term of idx s1 in this example immediately finds the two index.js files that are contained in structure1 folders, for example.
In a more real world example the names of the folders might contain the names of the components that they're a part of or something else that is providing a logical structure to the code, so you might do idx con to pull the index.js from the controller folder or idx mod to find the one in the model folder, and so on.
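To make the filtering behaviour described above concrete, here is a rough Python sketch of it: each term must match as an in-order (not necessarily adjacent) run of characters, and every space separated term must match, in any order. This is only an illustration of the idea, not Sublime's actual implementation; in particular it does no scoring or ranking, and the function names and sample paths are made up.

```python
def is_subsequence(term, text):
    """True if the characters of term occur in text in order, adjacent or not."""
    chars = iter(text.lower())
    # 'ch in chars' advances the iterator, so later characters must come later.
    return all(ch in chars for ch in term.lower())


def filter_paths(query, paths):
    """Keep the paths that match every space separated term of query."""
    terms = query.split()
    return [p for p in paths if all(is_subsequence(t, p) for t in terms)]


# Hypothetical project files, shown without the single top level folder.
paths = [
    "long/directory/structure/index.js",
    "long/directory/structure2/index.js",
    "long/directory2/index.js",
    "long/readme.md",
]

print(filter_paths("idx s2", paths))
# ['long/directory/structure2/index.js']
print(filter_paths("dir idx", paths) == filter_paths("idx dir", paths))
# True
```

The second call shows that the order of the terms does not matter, which is what lets you type the file name first and a fragment of the folder afterwards.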
Regarding a better/faster way to do this I don't think there is one, at least in the general case. Sublime inherently knows every file that's in your project as a part of indexing all of the files to power other features such as Goto Symbol and it uses file watchers to detect changes to the structure of the open folders.
Anything else, including a third party plugin or package, would need to first do a redundant file scan to accumulate the list of files and would also have to replicate the file watching that Sublime is already doing in order to know when things change.

Is it possible to insert a file into an exe?

I need to insert a generated file into an exe at the time of download. Currently, I create an "empty" file (filled with a repeating character) and package that with the exe. When it comes time to download, I look at the bytes for the installer, find the file by looking for the repeating character, and insert the generated file.
This process however is not working. The repeating character just does not show in the bytes. But I am certain the file is there as it is unpacked if I run the exe. Am I doing something wrong or is inserting a file into an exe even possible?
Also note that I am using Inno Setup Script v5.5.1 to compile the project into an exe.
If you want to change the contents of a file specified in a [Files] entry and compiled into the setup executable, then you must:
1. Make a dummy file that is at least as large as the largest content you will want to insert.
2. Fill the file (or at least the first 64 bytes or so) with something unique and easily distinguishable.
3. Mark its [Files] entry with the "nocompression noencryption dontverifychecksum" flags.
You should then be able to scan the resulting executable for the marker in #2 and then substitute the data that you want. Note however that doing this might invalidate any digital signature on the setup file, although I haven't tested this to be sure.
Note that if the content you are inserting is smaller than the dummy file size, the extra bytes will still remain on the end of your inserted content. So whatever reads the file will have to have some way to ignore that or to recognise the end of the interesting content.
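The scan-and-substitute step could be sketched in Python as below, assuming the dummy file was stored uncompressed as described. Everything named here is hypothetical: patch_placeholder, the marker bytes, and the choice to overwrite the leftover dummy bytes with a single pad byte so that the reader can recognise the end of the content. It does nothing about digital signatures.

```python
def patch_placeholder(exe_bytes, marker, payload, slot_size, pad=b"\x00"):
    """Return a copy of exe_bytes with the dummy file replaced by payload.

    marker    -- the unique bytes at the start of the dummy file
    slot_size -- the full size of the dummy file in bytes
    pad       -- a single byte used to fill the unused rest of the slot
    """
    if len(payload) > slot_size:
        raise ValueError("payload is larger than the dummy file")
    start = exe_bytes.find(marker)
    if start == -1:
        raise ValueError("marker not found; was the file stored uncompressed?")
    if exe_bytes.find(marker, start + 1) != -1:
        raise ValueError("marker occurs more than once; pick a more unique one")
    if start + slot_size > len(exe_bytes):
        raise ValueError("slot runs past the end of the executable")
    padded = payload + pad * (slot_size - len(payload))
    # Same length in, same length out: nothing else in the file moves.
    return exe_bytes[:start] + padded + exe_bytes[start + slot_size:]
```

Keeping the output exactly the same length as the input matters, because the installer still expects the embedded file to have its original size.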
So, if you are making changes to an existing exe file and the text is short, you can probably use a hex editor and make the changes at the desired location. If the text is longer, you might want to include some meaningless bytes in the file beforehand, just as filler.
