Downloading sources file to image directory - pharo

I am using PharoCloud to host a Pharo image for me. By default it downloads a ZIP containing only the image to my appliance; this ZIP doesn't include the .sources file.
I am trying to manually download the sources file with ZnClient. The directory my image is located in is /mnt/upload/upload.140605183221.
This is the code I have
| aFileStream |
aFileStream := '/mnt/universe/upload/upload.140605183221/PharoV30.sources' asFileName writeStream.
aFileStream write: (ZnClient new get: 'http://files.pharo.org/sources/PharoV30.sources.zip').
aFileStream close.
I'm brand new to ZnClient; I don't know how to use it. What's wrong with my code?

You can do this:
'./PharoV30.sources' asFileReference
    writeStreamDo: [ :stream |
        stream write: (ZnClient new get: 'http://files.pharo.org/sources/PharoV30.sources') contents ].

Nearly right. You need to replace the message #asFileName with #asFileReference, since #asFileName will answer a string object (so you actually get a WriteStream on the string).
fileReference := '/mnt/universe/upload/upload.140605183221/PharoV30.sources' asFileReference.
fileReference writeStreamDo: [ :stream |
    | url |
    url := 'http://files.pharo.org/sources/PharoV30.sources.zip'.
    stream write: (ZnClient new get: url) ]
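Note that the second snippet fetches the .zip URL but writes it under a .sources name, so you would still need to unzip what was downloaded; fetching the plain .sources URL (as in the first snippet) avoids that step. As an alternative, and assuming your Zinc version supports #downloadTo:, a sketch like the following streams the response straight to disk instead of holding it in memory:
ZnClient new
    url: 'http://files.pharo.org/sources/PharoV30.sources';
    downloadTo: '/mnt/universe/upload/upload.140605183221/PharoV30.sources'.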

Related

Django. TemporaryUploadedFile

I upload a file through the form, check it, and only after checking it do I want to add it to my database.
form = BookForm(request.POST, request.FILES)
file = form.files
path = file.get('book_file').temporary_file_path()
At this point, path is '/tmp/tmpbp4klqtw.upload.pdf'.
But as soon as I want to transfer this file from the temporary storage to some other folder, I get the following error:
path = os.replace(path, settings.MEDIA_ROOT)
IsADirectoryError: [Errno 21] Is a directory: '/tmp/tmpbp4klqtw.upload.pdf' -> '/home/oem/bla/bla'
I can't understand why this file doesn't really exist. What can I do about it? Is it possible to set some special path for the "temporary file"?
UPD:
You should use path = os.replace(path, settings.MEDIA_ROOT + '/name-of-file.pdf') – Willem Van Onsem
os.replace(…) [python-doc] expects a filename as the target if you specify a file as the source, so you can move the file with:
os.replace(path, f'{settings.MEDIA_ROOT}/name-of-file.pdf')
You can also make use of shutil.move(…) [python-doc] to specify the directory; this function will also return the filepath of the target file:
from shutil import move
target_file = move(path, settings.MEDIA_ROOT)
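For illustration, a minimal sketch of the whole step inside a Django view, assuming the form has already been validated and the upload field is named book_file (the helper name and the choice to reuse the original filename are my own):

import os
from shutil import move

from django.conf import settings

def handle_uploaded_book(form):
    uploaded = form.files['book_file']           # TemporaryUploadedFile for large uploads
    tmp_path = uploaded.temporary_file_path()    # e.g. /tmp/tmpbp4klqtw.upload.pdf
    target = os.path.join(settings.MEDIA_ROOT, uploaded.name)
    return move(tmp_path, target)                # returns the final path of the moved file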

Creating zip files in Pharo

How can one create zip files in Pharo and write them to disk?
I thought I could create the structure in a MemoryStore and then zip it as follows
root := FileSystem memory root ensureCreateDirectory.
root / 'sample.txt' writeStreamDo: [ :stream | stream << String loremIpsum ].
archive := ZipArchive new.
root allChildren select: #isFile thenDo: [ :each |
each readStreamDo: [ :readStream |
archive addString: readStream contents as: each fullName
]
].
archive writeTo: someFinalWriteStream.
archive close.
But apparently data added via addString:as: is not compressed at all.
Likewise addFile:/addFile:as: is not usable, because the implementation works only with a real disk, not a memory one.
So is there some other approach (or external library) with which I can zip my data without having to dump all the files to disk, zipping them there and reading the final file back?
Try something along the lines of:
string := String new: 14 withAll: $1.
zip := ZipArchive new.
member := zip addDeflateString: string as: 'filename.ext'.
member instVarNamed: 'compressionMethod' put: 8.
member rewindData.
bytes := ByteArray streamContents: [ :strm | member compressDataTo: strm ].
Note that I've forced the setting of compressionMethod; there must be a better way to do this, and I'm sure you will find it ;)
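Applied to your original example, an untested sketch along those lines would be to swap addString:as: for addDeflateString:as: so that the members actually get deflated:

archive := ZipArchive new.
root allChildren select: #isFile thenDo: [ :each |
    each readStreamDo: [ :readStream |
        archive addDeflateString: readStream contents as: each fullName ] ].
archive writeTo: someFinalWriteStream.
archive close.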

How to convert mongoDB data into arff file

I have different data sets from users, collected through different forms. I am using the MEAN stack, mongoose, and node-weka to analyze the stored data, but Weka expects ARFF files as input, which is why I have to convert the data stored in MongoDB into an ARFF file. Does anyone know how to do it? I am a beginner and I haven't found the right documents.
Here is the beginning of the code in NODE JS
var data = ... //ARFF json format
var options = {
  //'classifier': 'weka.classifiers.bayes.NaiveBayes',
  'classifier': 'weka.classifiers.functions.SMO',
  'params' : ''
};
var testData = {
  outlook : 'sunny',
  windy : 'TRUE'
};
weka.classify(data, testData, options, function (err, result) {
  console.log(result); //{ predicted: 'yes', prediction: '1' }
});
I don't know about a pure-javascript solution, only about a command-line solution (for linux, unix and mac).
In any case, many of weka's classifiers indeed expect an arff file as input.
You can export your JSON data to CSV, convert the CSV to .arff on the fly on the command line, and then pipe it to weka 3.6 (not 3.7).
You can use a bash script to convert CSV to ARFF via a tempfile; the special-purpose script weka-cluster below demonstrates this. Adapt it to your needs.
#!/usr/bin/env bash
ALGO="$@"
IN=$(mktemp --tmpdir weka-cluster-XXXXXXXX).arff

finish () {
  rm -f $IN
}
trap finish EXIT

csv2arff > $IN
weka filters.unsupervised.attribute.AddCluster -W "weka.${ALGO}" -i $IN -o /dev/stdout | arff2csv
call this script as
cat my.csv | weka-cluster clusterers.SimpleKMeans
You can extend this to MongoDB like this:
mymongoquery.sh | json2csv | (more optional filters, e.g. csvcut) | weka-cluster clusterers.SimpleKMeans
These command-line tools (but not mongo) are described in more detail in the book "Data Science at the Command Line" by Jeroen Janssens. Check out the book's GitHub repo for csv2arff and weka-cluster, and for how to install the other tools (arff2csv, csvcut, json2csv).
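If you would rather stay in Node, a hand-rolled conversion is also possible. Here is a rough sketch; the collection, attribute names, and types are made-up examples that you would adapt to your own schema:

// Minimal sketch: turn an array of flat MongoDB documents into an ARFF string.
function toArff(relation, attributes, docs) {
  var lines = ['@RELATION ' + relation, ''];
  attributes.forEach(function (attr) {
    // attr is e.g. { name: 'outlook', type: '{sunny,overcast,rainy}' } or { name: 'temp', type: 'NUMERIC' }
    lines.push('@ATTRIBUTE ' + attr.name + ' ' + attr.type);
  });
  lines.push('', '@DATA');
  docs.forEach(function (doc) {
    lines.push(attributes.map(function (attr) { return doc[attr.name]; }).join(','));
  });
  return lines.join('\n');
}

// Hypothetical usage with a 'weather' collection:
// db.collection('weather').find({}).toArray(function (err, docs) {
//   var arff = toArff('weather', [
//     { name: 'outlook', type: '{sunny,overcast,rainy}' },
//     { name: 'windy',   type: '{TRUE,FALSE}' }
//   ], docs);
//   require('fs').writeFileSync('weather.arff', arff);
// });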

Corrupted Excel File & 7zip

I have a problem with a corrupted Excel file. So far I have used 7zip to open it as an archive and extract most of the data, but some important sheets cannot be extracted.
Using the l command of 7zip I get the following output:
7z.exe l -slt "C:\Users\corrupted1.xlsm" xl/worksheets/sheet3.xml
Output:
Listing archive: C:\Users\corrupted1.xlsm
--
Path = C:\Users\corrupted1.xlsm
Type = zip
Physical Size = 11931916
----------
Path = xl\worksheets\sheet3.xml
Folder = -
Size = 57217
Packed Size = 12375
Modified = 1980-01-01 00:00:00
Created =
Accessed =
Attributes = .....
Encrypted = -
Comment =
CRC = 553C3C52
Method = Deflate
Host OS = FAT
Version = 20
However, when trying to extract it (or test it, for that matter) I get:
7z.exe t -slt "C:\Users\corrupted1.xlsm" xl/worksheets/sheet3.xml
Output:
Processing archive: C:\Users\corrupted1.xlsm
Testing xl\worksheets\sheet3.xml Unsupported Method
Sub items Errors: 1
The method listed above says Deflate, which is the same for all the worksheets.
Is there anything I can do? What kind of corruption is this? Is it the CRC? Can I ignore it somehow or something?
Please help!
Edit:
The following is the error when trying to extract or edit the xml file through 7zip:
Edit 2:
Tried with WinZip as well, getting:
Extracting to "C:\Users\axpavl\AppData\Local\Temp\wzf0b9\"
Use Path: yes Overlay Files: yes
Extracting xl\worksheets\sheet2.xml
Unable to find the local header for xl\worksheets\sheet2.xml.
Severe Error: Cannot find a local header.
This might help:
https://superuser.com/questions/145479/excel-edit-the-xml-inside-an-xlsx-file
and this one too: http://www.techrepublic.com/blog/tr-dojo/recover-data-from-a-damaged-office-file-with-the-help-of-7-zip/

Changing how nodejs require() fetches files

I'm looking to monkey-patch require() to replace its file loading with my own function. I imagine that internally require(module_id) does something like:
1. Convert module_id into a file path
2. Load the file path as a string
3. Compile the string into a module object and set up the various globals correctly
I'm looking to replace step 2 without reimplementing steps 1 and 3. Looking at the public API, there's require(), which does steps 1-3, and require.resolve(), which does step 1. Is there a way to isolate step 2 from step 3?
I've looked at the source of require mocking tools such as mockery -- all they seem to be doing is replacing require() with a function that intercepts certain calls and returns a user-supplied object, and passes on other calls to the native require() function.
For context, I'm trying to write a function require_at_commit(module_id, git_commit_id), which loads a module and any of that module's requires as they were at the given commit.
I want this function because I want to be able to write certain functions that a) rely on various parts of my codebase, and b) are guaranteed to not change as I evolve my codebase. I want to "freeze" my code at various points in time, so thought this might be an easy way of avoiding having to package 20 copies of my codebase (an alternative would be to have "my_code_v1": "git://..." in my package.json, but I feel like that would be bloated and slow with 20 versions).
Update:
So the source code for module loading is here: https://github.com/joyent/node/blob/master/lib/module.js. Specifically, to do something like this you would need to reimplement Module._load, which is pretty straightforward. However, there's a bigger obstacle: step 1, converting module_id into a file path, is actually harder than I thought, because resolveFilename needs to be able to call fs.exists() to know where to terminate its search. So I can't just substitute out individual files; I have to substitute entire directories, which means it's probably easier just to export the entire git revision to a directory and point require() at that directory than to override require().
Update 2:
Ended up using a different approach altogether... see answer I added below
You can use the require.extensions mechanism. This is how the coffee-script coffee command can load .coffee files without ever writing .js files to disk.
Here's how it works:
https://github.com/jashkenas/coffee-script/blob/1.6.2/lib/coffee-script/coffee-script.js#L20
loadFile = function(module, filename) {
  var raw, stripped;
  raw = fs.readFileSync(filename, 'utf8');
  stripped = raw.charCodeAt(0) === 0xFEFF ? raw.substring(1) : raw;
  return module._compile(compile(stripped, {
    filename: filename,
    literate: helpers.isLiterate(filename)
  }), filename);
};

if (require.extensions) {
  _ref = ['.coffee', '.litcoffee', '.md', '.coffee.md'];
  for (_i = 0, _len = _ref.length; _i < _len; _i++) {
    ext = _ref[_i];
    require.extensions[ext] = loadFile;
  }
}
Basically, assuming your modules have a set of well-known extensions, you should be able to use this pattern of a function that takes the module and filename, does whatever loading/transforming you need, and then returns an object that is the module.
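As a rough illustration of that pattern (loadSource here is a hypothetical stand-in for whatever produces the file contents, e.g. a lookup in a git commit):

var fs = require('fs');

// Hypothetical loader: swap fs.readFileSync for your own source of file contents.
function loadSource(filename) {
  return fs.readFileSync(filename, 'utf8');
}

require.extensions['.js'] = function (module, filename) {
  var source = loadSource(filename);
  // module._compile performs "step 3": compile the string and wire up the module globals.
  module._compile(source, filename);
};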
This may or may not be sufficient to do what you are asking, but honestly from your question it sounds like you are off in the weeds somewhere far from the rest of the programming world (don't take that harshly, it's just my initial reaction).
So rather than mess with the node require() module, what I ended up doing is archiving the given commit I need to a folder. My code looks something like this:
# commit_id is the commit we want
# (note that if we don't need the whole repository,
# we can pass "commit_id path_to_folder_we_need")
#
# path is the path to the file you want to require starting from the repository root
# (ie 'lib/module.coffee')
#
# cb is called with (err, loaded_module)
#
require_at_commit = (commit_id, path, cb) ->
  dir = 'old_versions' #make sure this is in .gitignore!
  dir += '/' + commit_id
  do_require = -> cb null, require dir + '/' + path
  if not fs.existsSync(dir)
    fs.mkdirSync(dir)
    cmd = 'git archive ' + commit_id + ' | tar -x -C ' + dir
    child_process.exec cmd, (error) ->
      if error
        cb error
      else
        do_require()
  else
    do_require()
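A hypothetical call would then look like this (the commit id, module path, and doSomething call are made up):

require_at_commit '1a2b3c4', 'lib/module.coffee', (err, mod) ->
  if err then console.error err else mod.doSomething()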
