I could not believe this: it seems that the ZIP specification does not allow two different files with the same file name to go into one ZIP file.
In my case I use an external file to specify all the files I want to zip.
This could look like this:
../Website1/favicon.ico
../Website2/favicon.ico
and there we are: that's not possible, despite keeping the directory structure. You would expect the stored name to be <../Website1/favicon.ico> rather than just <favicon.ico>, but that does not seem to be the case; I get:
"Invalid ZIP request (cannot repeat names in Zip file)"
with WinZip. I tried the same with 7-Zip: same result.
Strangely, googling did not turn up many hits that really fit, but those I found seem to confirm my findings. That's hard to believe, since this limitation is severe. I actually struggle to understand why it did not hit me a couple of decades earlier.
Am I overlooking something very basic here?
To be precise:
Adding these two files:
C:\Temp\Website1\FavIcon
C:\Temp\Website2\FavIcon
results in a single entry; the last Add wins...
This however:
Website1\FavIcon
Website2\FavIcon
results in a zip file that contains both files.
I have the following directory containing these CSV files:
/data/one.csv
/data/two.csv
/data/three.csv
/data/four.csv
If I want to read everything, I can simply do:
/data/*.csv
but I cannot seem to find a way to read everything except four.csv.
This:
/data/*[^four]*.csv
seemed to work, but I think that if the list of files were bigger, then this way of reading would probably be wrong (because of the double wildcards).
Is there a good way to do this? I am also aware that:
/data/{one,two,three,^four}.csv
would solve this specific case, but I need the "except" method for future needs.
Thank you very much!
I am not 100% sure that this method will work, but you can try it. You can use Bash, Python, or whatever script to scan all the CSV files in the folder, excluding any named four.csv.
The input for Spark will then be (assuming you have files one.csv, two.csv, three.csv, four.csv, five.csv, ... up to n.csv):
PathToFiles = ["/data/one.csv", "/data/two.csv", "/data/three.csv", "/data/five.csv", ..., "/data/n.csv"]
Then you can use (the code is in Python):
filesRDD = spark.sparkContext.wholeTextFiles(",".join(PathToFiles))
I have written similar code in Java; in my impression it works, and you can try it.
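A minimal sketch of that idea in Python (assuming the files sit on a filesystem glob can see, spark is the usual SparkSession, and the names are the ones from the question):

import glob
import os

# Collect every CSV under /data except four.csv.
PathToFiles = [p for p in sorted(glob.glob("/data/*.csv"))
               if os.path.basename(p) != "four.csv"]

# wholeTextFiles accepts a comma-separated list of paths.
filesRDD = spark.sparkContext.wholeTextFiles(",".join(PathToFiles))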
In my company, I need to organize all the files and folders from ALL servers into text, basically inside the Smartsheet platform.
I was writing all the names of the files and folders by hand; then I realized that there are so many files that I will never finish writing them all!
So I tried to find an easy solution, and I found a Windows command here that helps a lot:
Code: Tree /f /a >file.txt (I tried converting the txt to CSV online, but if I change the end of the command to >file.csv it also works, so there is no need to convert.)
Anyway, I need that command to also include the SIZE of all the files and folders; the boss wants to know the size of all the files too.
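Since Tree itself cannot print sizes, a rough Python sketch of what I am after could look like this (hypothetical output file name; sizes in bytes; + marks folders and - marks files, matching the photos below):

import os

def dir_size(path):
    # Total size in bytes of every file under path.
    total = 0
    for dirpath, dirnames, filenames in os.walk(path):
        for name in filenames:
            try:
                total += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                pass  # skip unreadable files
    return total

def write_tree(root, out, depth=0):
    # Indented tree listing: "+" for folders, "-" for files.
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        indent = "    " * depth
        if entry.is_dir(follow_symlinks=False):
            out.write(f"{indent}+ {entry.name} ({dir_size(entry.path)} bytes)\n")
            write_tree(entry.path, out, depth + 1)
        else:
            out.write(f"{indent}- {entry.name} ({entry.stat().st_size} bytes)\n")

with open("file.txt", "w", encoding="utf-8") as out:
    write_tree("C:\\Temp", out)  # hypothetical root folder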
Then there is another problem: if I import the tree output into Smartsheet / Excel, the tree effect (this is important, the tree) doesn't work very well, and I'm going crazy trying to figure out a way to do something like this: the tree, when imported into Smartsheet / Excel, needs to automatically have hide / show rows (with + and - symbols). Like these photos:
with + : https://ibb.co/cLjQLpF
with - : https://ibb.co/g3Y2pps
To explain better: + means a folder, and I want all the folders (+) in a tree.
If a program exists that does this automatically, even better, but I cannot find one.
Thanks
regards
I thought this would be simple, but I have been caught by the simplest of puzzles, which I can't find the answer to anywhere.
I have some code which reads images, and then OpenCV looks for differences.
I read the files with the following call:
vs = cv2.VideoCapture("/home/andrew/images/image_%6d.jpg")
and this works perfectly with images called image_000000.jpg, image_000001.jpg, and so on.
However, I don't want to rename my images, so I would like to read files called
MDAlarm_20180921-031140.jpg, which contain the date and then the time.
What is the printf format for this? Whatever I try does not work (i.e. no files are found). Or do the files need to start from 0, so that I need to append an index starting at 000000?
Lastly, once I have this working, how can I tell which file is being processed?
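For what it's worth, the only fallback I can think of is to skip VideoCapture's printf pattern (which expects a single integer counter) and read the files one by one; a minimal sketch using the directory from above:

import glob
import cv2

# Read the timestamped images in name (i.e. date/time) order.
for path in sorted(glob.glob("/home/andrew/images/MDAlarm_*.jpg")):
    frame = cv2.imread(path)
    if frame is None:
        continue  # unreadable file, skip it
    print("processing", path)  # also answers "which file is being processed?"
    # ... run the difference detection on frame here ...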
Many Thanks
Andrew
I'm researching a project on software-defined networking discussed on knowledgedefinednetworking.org, and they provide several datasets. Two of the three datasets are unzipping just fine (100K.csv.gz & train.csv.gz), but benchmark.csv.gz unzips into a new spreadsheet that comes back empty, even though it still takes up 3.3 GB. I'm using WinZip to unzip the files, and they're all going into the same folder, but only benchmark is coming back empty. Is this a common issue, or is there something potentially wrong with the download of the file that causes it to unzip empty?
"Is this a common issue" <-- Simple answer : Yes, it is a coomn issue.
"or is there something potentially wrong with the download of the file that causes it to unzip empty?" <-- the download went ok.. I tried to save it as excel. went ok. The files (file1 file2) is not blank.
Note : try to use 7zip as your file uncompressor.
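To rule out the extractor entirely, a quick sketch like this (assuming pandas is installed and the file name matches your download) reads the .gz directly, without unzipping first:

import pandas as pd

# pandas infers gzip compression from the .gz extension.
df = pd.read_csv("benchmark.csv.gz")
print(df.shape)  # a non-zero row count means the data really is there
print(df.head())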
Hope that solves it... (:
I've written a small server function which is intended to tar together a bunch of locally downloaded files, then delete the originals. It looks something like this:
import os
import tarfile

with tarfile.open(archive_filename, "w:gz") as tar:
    for pb in designated_objects:
        bucket.download_file(pb.key, pb.key)  # fetch the object from S3
        tar.add(pb.key)                       # add the local copy to the archive
        os.remove(pb.key)                     # delete the original local copy
My expectation is that this will generate a tarfile with all of my desired data and an otherwise empty directory. The idea here is to minimize my disk usage as much as possible. However, I'm unsure whether deleting a file before the tarfile is finished being generated (as done here) is allowed.
Will this expression work as expected?
If it will not, is there something akin to an append mode that will?
As expected, the original files are populated, then deleted. However, the behavior of the archive is unusual: when this code block is run, no archive is generated. In fact, this code block will do nothing at all (except delete your files).
I find this behavior particularly unusual and surprising given that a bare pass inside the with statement (as in the code that follows) will actually write an empty archive to disk. So in a sense, the given code block does even less than nothing!
with tarfile.open('archive_filename.xy.gz', "w:gz") as tar:
    pass
For reference, this behavior is what I get with Python 3.6. Behavior with other versions of Python may differ.
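For comparison, here is a sketch of the conservative variant I would fall back on (same hypothetical names as above): delete the originals only after the with block has closed, and thereby finalized, the archive.

import os
import tarfile

downloaded = []
with tarfile.open(archive_filename, "w:gz") as tar:
    for pb in designated_objects:
        bucket.download_file(pb.key, pb.key)
        tar.add(pb.key)
        downloaded.append(pb.key)

# The archive is finalized when the with block exits; only then clean up.
for key in downloaded:
    os.remove(key)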