I have around 150 files with the same .txt extension, but the filenames are just alphanumeric strings (e.g. 7J9E45600.txt, FF5632088.txt, etc.). I have a list where the alphanumeric strings are matched to more meaningful names. I want to replace these alphanumeric strings with the meaningful names, but would like to do it programmatically. Most of the existing solutions rename multiple files with incrementally increasing numbers, e.g. via a loop, but in my case all the filenames will be different. An example of what I want to do: rename 7J9E45600.txt to adipose.txt, rename FF5632088.txt to brain.txt, etc. A solution using Linux, R, Perl or Python is most welcome.
Yes, this is easy to do with a for loop in R.
Make or read in your data, with a column of old names and a matching column containing the new names. Here is an example with four files.
oldnames<-c("/Users/foo/Documents/pictures/Test/2020-04-21 19.59.jpg",
"/Users/foo/Documents/pictures/Test/2020-04-21 19.59.35.jpg",
"/Users/foo/Documents/pictures/Test/2020-04-21 19.58.37.jpg",
"/Users/foo/Documents/pictures/Test/2020-04-21 17.21.06.jpg")
newnames<-c("/Users/foo/Documents/pictures/Test/2021-04-21 19.59.59.jpg",
"/Users/foo/Documents/pictures/Test/2021-04-21 19.59.35.jpg",
"/Users/foo/Documents/pictures/Test/2021-04-21 19.58.37.jpg",
"/Users/foo/Documents/pictures/Test/2021-04-21 17.21.06.jpg")
testnames_df <- data.frame(oldnames, newnames, stringsAsFactors = FALSE)
for (i in seq_len(nrow(testnames_df))) {
  file.rename(from = testnames_df$oldnames[i], to = testnames_df$newnames[i])
}
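Since the question also welcomes Python, the same mapping-based rename can be sketched there. This is a minimal sketch: the dictionary below just restates the question's example pairs, and in practice you would read the mapping from your list file.

```python
import os

def rename_from_map(name_map, directory="."):
    """Rename files in `directory` according to a dict of {old_name: new_name}."""
    for old, new in name_map.items():
        src = os.path.join(directory, old)
        if os.path.exists(src):  # skip entries whose file is missing
            os.rename(src, os.path.join(directory, new))

# Example pairs from the question; load your real mapping from the list file.
mapping = {"7J9E45600.txt": "adipose.txt", "FF5632088.txt": "brain.txt"}
```

Calling `rename_from_map(mapping, "/path/to/files")` then renames every file that appears in the mapping and silently skips the rest.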
I am trying to compare two .xlsx files. What I am looking to do is basically the following:
Does any cell in column B of file1 exist in column B of file2?
If yes, continue.
Else, add the row to file2
The structure of the two files is different, so I would also need to reorganize the information being added to file2 to match its format, but I think I can handle that myself once I know how to do the transfer.
The files are basically a vulnerability export from ACAS and a POA&M. I want to add any existing vulnerabilities from the export that are not already represented on the POA&M.
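One way to sketch this in Python is with pandas, assuming both workbooks can be read with `read_excel` and that the key column has the same header in both files. The header name "Plugin ID" below is a placeholder for whatever column B actually holds in your export and POA&M.

```python
import pandas as pd

def rows_missing_from(scan_df, poam_df, key):
    """Rows of scan_df whose `key` value never appears in poam_df[key]."""
    return scan_df[~scan_df[key].isin(poam_df[key])]

def update_poam(scan_file, poam_file, key="Plugin ID"):
    # "Plugin ID" is a placeholder for whatever header column B carries.
    scan = pd.read_excel(scan_file)
    poam = pd.read_excel(poam_file)
    missing = rows_missing_from(scan, poam, key)
    # Reshaping `missing` to match the POA&M column layout would go here.
    pd.concat([poam, missing], ignore_index=True).to_excel(poam_file, index=False)
```

The `isin` test does the "does any cell in column B of file1 exist in column B of file2" check in one vectorized step; the reshaping step is left as the placeholder the question anticipates.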
I've been tasked with reformatting a number of records in a spreadsheet to conform to a unified standard. We have a column containing a large amount of text, along with HTML tags, but I only need to target the tags. Our src paths merely need to be capitalized, but not the entire path. However, the paths all follow this general format.
(something)/custom_design/directory/(more directories)/imageName.jpg
I only need to capitalize /custom_design/directory/(more directories)/. I'll remove the (something) at the beginning of the src path later. Due to the enormous size of this file and the lack of a unified file structure (some image paths use img, others use images, etc.), it would be extremely time-consuming to go through each and every cell in that column and manually change the paths. Is there a faster approach to capitalizing these file paths? Find and replace only goes so far when you don't know the specific directories.
I should mention that the reason I want to target these specific strings, rather than the entire cell's contents, is because these cells are filled with a lot of other descriptive text that shouldn't be completely capitalized.
This is a partial solution for Excel. You can use the logic in this formula to substitute text by finding its location, as delimited by the forward slashes (/) in the string. The formula is a combination of SUBSTITUTE, LEFT, RIGHT, and FIND.
With your original string in A1:
B1 = SUBSTITUTE(A1,RIGHT(LEFT(A1,FIND("/",A1,FIND("/",A1)+1)-1),FIND("/",A1)),UPPER(RIGHT(LEFT(A1,FIND("/",A1,FIND("/",A1)+1)-1),FIND("/",A1))))
You can deconstruct this logic to isolate the strings you want. It's not pretty, but it is the only way I personally know to do this in Excel.
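If the column can be exported to text or CSV, a regex can target just the directory portion of the path. This is a sketch under the assumption that every path of interest starts at `/custom_design/` and ends at the last slash before the filename, as in the question's general format; the sample string in the usage note is invented.

```python
import re

# Match from "/custom_design/" through the last "/" before the filename.
# Stop at whitespace or quotes so surrounding descriptive text is untouched.
PATH_RE = re.compile(r"(/custom_design/[^\s\"']*/)")

def capitalize_dirs(text):
    """Uppercase only the directory segment of each matching src path."""
    return PATH_RE.sub(lambda m: m.group(1).upper(), text)
```

Applied to a cell like `see site/custom_design/img/products/photo.jpg here`, only `/custom_design/img/products/` is uppercased; the filename and the rest of the cell's text are left alone.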
I am using Stata 14 and have a dataset which contains a large group of variables:
court_date1 court_date2 court_date3
I would like to change part of each variable name while keeping the number at the end:
court_event1 court_event2 court_event3
Is there a way to do so as a group using the wildcard (*)? They are numbered consecutively, but are not listed consecutively in the dataset.
rename (*date*) (*event*)
works with just the names you give. If that catches too much, then
rename (court_date*) (court_event*)
See help rename groups, including the dryrun option.
Want some advice ... I'm trying to build an index or glossary of words for all text files in a directory.
It needs to contain each word (special chars removed), followed by the names of the files for EVERY occurrence.
I've started to use dictionaries, but how do I add the file references?
Would a list-of-words approach have some problem?
A dictionary sounds like the way to go. Every word is a key, the value is the list of text files that hold this word.
A list is much slower than a dictionary if you are searching for a single occurrence of a word. Especially since there will likely be a lot of words in your glossary, I would stick with a dictionary for speed.
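The word-to-files dictionary described above can be sketched as follows; the `.txt` filter and the alphanumeric word pattern are assumptions about what "special chars removed" means here.

```python
import collections
import os
import re

def build_index(directory):
    """Map each word (lowercased, non-alphanumeric chars stripped) to the
    set of text files in `directory` that contain it."""
    index = collections.defaultdict(set)
    for name in os.listdir(directory):
        if not name.endswith(".txt"):
            continue
        with open(os.path.join(directory, name), encoding="utf-8") as fh:
            for word in re.findall(r"[A-Za-z0-9]+", fh.read()):
                index[word.lower()].add(name)
    return index
```

Using a `defaultdict(set)` means adding a reference is a single `index[word].add(name)`, and duplicates within one file are handled for free.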
After opening a big csv file in gvim, how can I know how many columns are within this file?
The csv.vim plugin provides a lot of functionality to work with CSV data. It includes a :NrColumns command.
A quick dirty hack would be to do something like:
:s/,//gn
Which would give you the number of commas on a single row. Add one and you have your number of columns (assuming no trailing comma, of course).
I say this is quick and dirty because it doesn't take into account quoted fields, which can contain commas. There might be a way to handle that with a regex, but it's probably not trivial.
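If stepping outside vim is an option, Python's csv module handles the quoted-comma case the substitute trick misses. A minimal sketch, counting the fields of the first row:

```python
import csv

def column_count(path):
    """Number of columns in the first row of a CSV, honoring quoted fields."""
    with open(path, newline="") as fh:
        return len(next(csv.reader(fh)))
```

Because `csv.reader` parses quoting, a field like `"b,c"` counts as one column rather than two.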