Blank excel spreadsheet when unzipping file - excel

I'm researching a project on software defined networking discussed on knowledgedefinednetworking.org and they provide several datasets. Two of the three datasets are unzipping just fine (100K.csk.gz & train.csv.gz), but benchmark.csv.gz unzips into a new spreadsheet but still uses 3.3GB of memory. I'm using WinZip to unzip the files and they're all going into the same folder, but only benchmark is coming back empty. Is this a common issue or is there something potentially wrong with the download of the file that causes it to unzip empty?

"Is this a common issue?" <-- Simple answer: yes, it is a common issue.
"Or is there something potentially wrong with the download of the file that causes it to unzip empty?" <-- The download went fine for me. I also tried saving the file as Excel, and that went fine too. The files (file1, file2) are not blank.
Note: try using 7-Zip to decompress the file.
Hope that solves it. (:

Reading whole directory with Spark except one file

I have the following directory containing these CSV files:
/data/one.csv
/data/two.csv
/data/three.csv
/data/four.csv
If I want to read everything, I can simply do:
/data/*.csv
but I cannot seem to read everything except four.csv.
This:
/data/*[^four]*.csv
seemed to work, but I think that if the list of files were bigger, this way of reading would probably be wrong (because of the double wildcards).
Is there a good way to do this? I also am aware that:
/data/{one,two,three,^four}.csv
would solve this specific case, but I need the except method for future needs.
Thank you very much!
I am not 100% sure that this method will work, but you can try it. You can use Bash, Python, or any other script to scan all the CSV files in the folder and skip the one named four.csv.
The input for Spark will then be (assuming you have the files one.csv, two.csv, three.csv, four.csv, five.csv, ... up to n.csv):
PathToFiles = ["/data/one.csv", "/data/two.csv", "/data/three.csv", "/data/five.csv", ..., "/data/n.csv"]
Then you can use (the code is in Python):
filesRDD = spark.sparkContext.wholeTextFiles(",".join(PathToFiles))
I have written similar code in Java and, as far as I remember, it worked, so it is worth a try.
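The scanning step described above could be sketched in Python roughly as follows. This is only a sketch under assumptions: the helper name csv_paths_except is hypothetical, and glob only sees a local filesystem, so it would not list files that live on HDFS or another remote store.

```python
import glob
import os


def csv_paths_except(directory, excluded_names):
    """Return a comma-separated string of the CSV paths in `directory`,
    skipping any file whose base name is listed in `excluded_names`."""
    excluded = set(excluded_names)
    paths = sorted(
        path
        for path in glob.glob(os.path.join(directory, "*.csv"))
        if os.path.basename(path) not in excluded
    )
    return ",".join(paths)
```

The resulting string could then be passed to the call shown above, for example spark.sparkContext.wholeTextFiles(csv_paths_except("/data", ["four.csv"])), with the same caveat that this approach is untested here.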

cv2.VideoCapture directory of images

I thought this would be simple, but I have been caught by the simplest of puzzles, which I can't find the answer to anywhere.
I have some code which reads images and then OpenCV looks for differences.
I read files with the following command
vs = cv2.VideoCapture("/home/andrew/images/image_%6d.jpg")
and this works perfectly with images called image_000000.jpg, image_000001.jpg, and so on.
However, I don't want to rename my images, so I would like to read files called
MDAlarm_20180921-031140.jpg, which contain the date and then the time.
What is the printf format for this? Whatever I try does not work (no files found). Or do the file names need to start from 0, so that I need to append an index
starting at 000000?
Lastly, once I have this working, how can I tell which file is being processed?
Many Thanks
Andrew

Cannot zip files with the same name?

I could not believe this: it seems that the zip specification does not allow two different files with the same file name going into one zip file.
In my case I use an external file to specify all the files I wanna zip.
This could look like this:
../Website1/favicon.ico
../Website2/favicon.ico
and there we are, that's not possible, despite keeping the directory structure. You would expect the name to be <../Website1/favicon.ico> rather than just <favicon.ico>, but that does not seem to be the case. I get:
"Invalid ZIP request (cannot repeat names in Zip file)"
with WinZip. I tried the same with 7Zip - same result.
Strangely googling did not show many hits that really fit but those I found seem to confirm my findings. That's hard to believe since this limitation is very severe. I actually struggle to understand why this did not hit me a couple of decades earlier.
Am I overlooking something very basic here?
To be precise:
Adding these two files:
C:\Temp\Website1\FavIcon
C:\Temp\Website2\FavIcon
results in a single file; the last Add wins...
This however:
Website1\FavIcon
Website2\FavIcon
results in a zip file that contains both files.
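To illustrate the observation above, here is a small sketch using Python's standard zipfile module instead of WinZip or 7-Zip (so it says nothing about how those tools behave). The helper name zip_with_relative_names and the file contents are made up for the example. It shows that two files with the same base name can share an archive as long as the stored relative paths differ.

```python
import io
import zipfile


def zip_with_relative_names(files):
    """Build a zip archive in memory.

    `files` maps the name stored in the archive (a relative path) to the
    file content as bytes. Returns the archive as bytes.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        for arcname, data in files.items():
            archive.writestr(arcname, data)
    return buffer.getvalue()


archive_bytes = zip_with_relative_names({
    "Website1/favicon.ico": b"icon one",
    "Website2/favicon.ico": b"icon two",
})

# Read the archive back and list the stored entry names.
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    names = archive.namelist()
    first = archive.read("Website1/favicon.ico")
    second = archive.read("Website2/favicon.ico")
```

Both entries survive because the archive identifies them by the full stored path, not by the base name alone.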

Deal with ZIP-Buffer in node.js

I am building the server part of a webapp, using node.js. This involves getting data from thetvdb.com (API documentation of thetvdb).
The data comes as a zip file. HTTP download is no problem, however, parsing the file is. I actually never save the file, but just keep it in memory, as suggested in How to download and unzip a zip file in memory in NodeJs?
I have a buffer with valid data (the same data as when I download the file with a browser, curl, and so on). However, adm-zip can't open it (I also tried other zip libraries, and some report an invalid zip length). It does not show an error, but the zipEntries at the end have a length of 0.
When I write out the buffer to the filesystem and open it with gui or cli tools it works.
I can't give a direct link to the file, as it would involve my API key, however I re-uploaded it here.
I think I might have an answer for you:
Don't rely on npm install. I just ran the example that you linked to with the zip file you provided, and I get an output of "0".
I saw a comment on that other StackOverflow page, saying that the version of adm-zip on npm is not up to date. I grabbed a fresh copy of adm-zip from github, overwrote the one in my node_modules folder and reran the example code and now get the following:
...
<Actor>
<id>237811</id>
<Image>actors/237811.jpg</Image>
<Name>Peter Pratt</Name>
<Role>The Master</Role>
<SortOrder>3</SortOrder>
</Actor>
<Actor>
<id>23780s/237811.jpg</Image>
Give that a shot!

Problems came up in the following areas during load: Table

I have generated an Excel file from XML, but I cannot open it with Excel. Excel gives the following error when opening it:
Problems came up in the following areas during load:
Table
Then it shows a message that the log file corresponding the error can be found at : C:/Documents and Setting/myUserName/Local Settings/Temporary Internet Files/Content.MSO/xxxxx.log
But I cannot find the Content.MSO folder in Windows. I checked the folder settings and made all folders visible, but I still cannot access this folder, so I cannot analyse the log file.
How can I find the generated log file?
I found the problem without analysing the log file. I still cannot access the log file in Temporary Internet Files, but I realised that I had put string (non-numeric) characters in a number-styled cell in the Excel XML. So if you are having similar issues with an Excel file generated from XML, check whether your cell values are appropriate for your cell data types.
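A check like the one described above could be automated before opening the file in Excel. The following Python sketch assumes the file is in the Excel 2003 XML Spreadsheet format, where cell values look like <Data ss:Type="Number">42</Data>. The function name find_bad_number_cells and the sample workbook are made up for illustration, and float() is only a rough stand-in for what Excel accepts as a number.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Excel 2003 XML Spreadsheet format.
SS = "urn:schemas-microsoft-com:office:spreadsheet"


def find_bad_number_cells(xml_text):
    """Return the text of every Number-typed <Data> element that does not
    parse as a number."""
    root = ET.fromstring(xml_text)
    bad = []
    for data in root.iter(f"{{{SS}}}Data"):
        if data.get(f"{{{SS}}}Type") == "Number":
            text = (data.text or "").strip()
            try:
                float(text)
            except ValueError:
                bad.append(text)
    return bad


# Made-up sample: the second cell is typed Number but holds a string.
sample = """<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Sheet1"><Table>
    <Row><Cell><Data ss:Type="Number">42</Data></Cell></Row>
    <Row><Cell><Data ss:Type="Number">n/a</Data></Cell></Row>
  </Table></Worksheet>
</Workbook>"""

bad_cells = find_bad_number_cells(sample)
```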
If you type or paste the path of the log file into Explorer or your text editor of choice, you may find that the folder does exist, despite being invisible.
In my case it was a <Row> with an incorrect ss:Index
I was using a template and the last row had a fixed Index=100. If the number of rows I added exceeded 100, this last row had a wrong index and excel threw the error without any other message or log (MacOSX, Excel 15.25.1). I wish they printed more informative error messages, what a waste of our time.
Excel 2016. My error message was "Worksheet Settings". Path was pointing to non-existing file.
My cause of the problem was ExpandedRowCount not big enough for number of rows in Worksheet. If you add rows in XML directly (i.e. on a machine where Excel is not installed), make sure to increment number of rows in ExpandedRowCount.
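The ExpandedRowCount condition can also be checked with a short script. This Python sketch again assumes the Excel 2003 XML Spreadsheet format. The function names and the sample are hypothetical, and the sketch ignores other attributes that can affect row positions (such as ss:Span), so treat it as a rough check only.

```python
import xml.etree.ElementTree as ET

# Namespace used by the Excel 2003 XML Spreadsheet format.
SS = "urn:schemas-microsoft-com:office:spreadsheet"


def last_row_index(table):
    """Return the 1-based position of the last <Row> in a <Table>,
    honouring explicit ss:Index attributes."""
    index = 0
    for row in table.findall(f"{{{SS}}}Row"):
        explicit = row.get(f"{{{SS}}}Index")
        index = int(explicit) if explicit is not None else index + 1
    return index


def check_expanded_row_count(xml_text):
    """Return (declared, needed) pairs for every table whose
    ss:ExpandedRowCount is smaller than its last row position."""
    root = ET.fromstring(xml_text)
    problems = []
    for table in root.iter(f"{{{SS}}}Table"):
        declared = table.get(f"{{{SS}}}ExpandedRowCount")
        if declared is None:
            continue
        needed = last_row_index(table)
        if int(declared) < needed:
            problems.append((int(declared), needed))
    return problems


# Made-up sample: a row added after the row fixed at index 100.
sample = """<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
          xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
  <Worksheet ss:Name="Sheet1">
    <Table ss:ExpandedRowCount="100">
      <Row><Cell><Data ss:Type="String">first</Data></Cell></Row>
      <Row ss:Index="100"><Cell><Data ss:Type="String">footer</Data></Cell></Row>
      <Row><Cell><Data ss:Type="String">extra</Data></Cell></Row>
    </Table>
  </Worksheet>
</Workbook>"""

problems = check_expanded_row_count(sample)
```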
Yes, I faced the same problem too, and it was caused by the data type of the cells in the Excel file generated using XSLT.
In addition to checking the data being used vs "Type" assigned, make sure that the list of characters that need to be encoded for XML are indeed encoded.
I had a system that appeared to be working, but then some user data including & and < was throwing this error.
If you're not sure what's going on with your file, try http://www.xmlvalidation.com/ - that helped me spot the issue in a large file immediately.
I used this function to fix it, modified from this post:
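function xmlsafe($s) {
    // '&' must be replaced first so the entities added afterwards are not escaped again.
    return str_replace(array('&','>','<','"'), array('&amp;','&gt;','&lt;','&quot;'), $s);
}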
and then run echo xmlsafe($myvalue) where you were just echoing $myvalue in your script.
This seems to be more appropriate for XML than htmlentities() or other options built into PHP.
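For readers who generate the file from Python rather than PHP, the same escaping can be sketched with the standard library. The function name xml_safe is hypothetical. xml.sax.saxutils.escape handles &, < and > by itself, and the extra mapping adds the double quote.

```python
from xml.sax.saxutils import escape


def xml_safe(value):
    """Escape &, <, > and double quotes so `value` can be embedded in XML."""
    return escape(value, {'"': "&quot;"})


escaped = xml_safe('Tom & <Jerry> "x"')
```

As in the PHP version, the ampersand is handled before the other characters, so the entities that get inserted are not escaped a second time.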
I had the same issue, and the cause was that the cell type was Number and some of the values produced by my backend could not be converted to that type.
I had the SAME problem,
and it was because the file was TOO BIG.
I tried a smaller extract from SAP (smaller than the one that caused the error) and saved it as an XML file, and it WORKED, with no more errors.
So maybe saving the data as 2 Excel XML files instead of 1 will help ;)
ALicia
