Creating a tag cloud from Text::NSP output - text

I am using Text::NSP, which creates n-grams from text files. Is it possible to create tag clouds from an output file of Text::NSP? I have used and liked the IBM Word Cloud Generator, but it only produces a tag cloud from the frequency of each single word within a file, whereas I am working with 2-grams and 3-grams. In short, I need a tag cloud generator that accepts an input file of words and their occurrence counts. I am running on Debian.
Thanks all.

I started using the R snippets package, which is what I was searching for.
The output of Text::NSP has to be converted with some bash scripts into a data frame that R will accept.
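In case it helps anyone else, here is a rough sketch of that conversion step (in Python rather than bash, and assuming count.pl output lines of the form word1<>word2<>12 7 9, where the first number after the last <> is the n-gram frequency; adjust the parsing if your count.pl options produce a different layout):

    #!/usr/bin/env python3
    # Convert Text::NSP count.pl output into a term,freq CSV for R.
    import csv
    import sys

    def nsp_to_csv(infile, outfile):
        with open(infile, encoding="utf-8") as src, \
             open(outfile, "w", newline="", encoding="utf-8") as dst:
            writer = csv.writer(dst)
            writer.writerow(["term", "freq"])
            for line in src:
                line = line.strip()
                if "<>" not in line:           # skip the leading total-count line
                    continue
                *words, counts = line.split("<>")
                freq = int(counts.split()[0])  # first count = n-gram frequency
                writer.writerow([" ".join(words), freq])

    if __name__ == "__main__":
        nsp_to_csv(sys.argv[1], sys.argv[2])

From R, something like read.csv() on that file followed by wordcloud::wordcloud(d$term, d$freq) should then be able to draw the cloud.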


Changing Cisco IOS description with Python - Excel input

What I'm going to do is write a Python script that takes an Excel file as input, reads the interface numbers and descriptions of a switch listed there, then SSHes to the Cisco switch and changes each description to the value taken from the Excel file.
Could anybody give me a hint?
Try checking the netmiko module. I was able to do something close to what you require using netmiko, but now I use the Ansible ios_command module, which is a lot easier for a non-programmer network engineer.
Start with Paramiko or Netmiko; Netmiko is the somewhat higher-level of the two. I would also rethink the actual project: instead of thinking about one switch, think about all of them and see whether there is some universal task you need to perform on every switch rather than just one.
For this project you could do the following (a netmiko sketch follows below):
1. Save the data in a CSV file.
2. Open the CSV file.
3. Create a dictionary d, saving each interface name as a key and its description as the value.
4. Create a list where you can save all your keys --> l = d.keys()
5. SSH to the switch via Paramiko/Netmiko.
6. Run a loop over the list l and on each iteration send the commands below:
interface l[i]
description d[l[i]]
This will translate to something like:
interface eth1/1
description d['eth1/1'] (d['eth1/1'] will be the value/description of whatever you get from the CSV)
If you really want to learn Python then this is a good start; however, if you are on a time crunch, Ansible is the easier option.
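If you do go the Python route, a minimal netmiko sketch of those steps might look like this (the CSV layout, host address, credentials and file name are all placeholders):

    # Assumes a CSV with header "interface,description" and a single Cisco IOS switch.
    import csv
    from netmiko import ConnectHandler

    # Steps 1-3: read the CSV into a dict {interface_name: description}
    with open("interfaces.csv", newline="") as f:
        descriptions = {row["interface"]: row["description"] for row in csv.DictReader(f)}

    # Step 5: SSH to the switch
    switch = ConnectHandler(
        device_type="cisco_ios",
        host="192.0.2.10",     # placeholder management IP
        username="admin",
        password="secret",
    )

    # Step 6: loop over the interfaces and push the two config lines per port
    for intf, desc in descriptions.items():
        switch.send_config_set([f"interface {intf}", f"description {desc}"])

    switch.disconnect()

send_config_set enters and leaves configuration mode for you, so you only need to hand it the interface and description lines.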

Azure Cognitive Services speech to text recognition of numeric entities as text

I was wondering whether it is possible for the C++ SDK of Cognitive Services Speech to Text to return numeric entities as text instead of numbers.
Current response: 'I want to order 2 Cokes'
Expected response: 'I want to order two Cokes'
Of course I can implement the conversion myself, but I was wondering if it is something the service already provides, particularly for Spanish.
Take a look at the sample repository at https://github.com/Azure-Samples/cognitive-services-speech-sdk,
especially the file speech_recognition_samples.cpp, function SpeechRecognitionWithLanguageAndUsingDetailedOutputFormat.
Enabling ‘detailed output’ will give you the result you want:
config->SetOutputFormat(OutputFormat::Detailed);
Then you need to look at the detailed output:
result->Properties.GetProperty(PropertyId::SpeechServiceResponse_JsonResult)
And that would create detailed output like this:
{"Duration":35500000,"NBest":[{"Confidence":0.7535948753356934,"Display":"I want to order 2 Cokes.","ITN":"I want to order 2 cokes","Lexical":"i want to order two cokes","MaskedITN":"I want to order 2 cokes"}],"Offset":17000000,"RecognitionStatus":"Success"}
The lexical output is probably what you want.
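Pulling the lexical form out of that JSON is just a matter of reading NBest[0].Lexical; a minimal sketch (shown in Python purely to illustrate the JSON shape, since the C++ SDK hands you the same string to parse with whatever JSON library you prefer):

    import json

    # The detailed JSON string returned via SpeechServiceResponse_JsonResult
    detailed = '{"Duration":35500000,"NBest":[{"Confidence":0.7535948753356934,"Display":"I want to order 2 Cokes.","ITN":"I want to order 2 cokes","Lexical":"i want to order two cokes","MaskedITN":"I want to order 2 cokes"}],"Offset":17000000,"RecognitionStatus":"Success"}'

    result = json.loads(detailed)
    lexical = result["NBest"][0]["Lexical"]   # "i want to order two cokes"
    print(lexical)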
Wolfgang

Get default stop word list in elastic search

I am trying to find out what the predefined stop word lists for Elasticsearch are, but I have found no documented read API for this.
So, I want to find the word lists behind these predefined variables (_arabic_, _armenian_, _basque_, _brazilian_, _bulgarian_, _catalan_, _czech_, _danish_, _dutch_, _english_, _finnish_, _french_, _galician_, _german_, _greek_, _hindi_, _hungarian_, _indonesian_, _irish_, _italian_, _latvian_, _norwegian_, _persian_, _portuguese_, _romanian_, _russian_, _sorani_, _spanish_, _swedish_, _thai_, _turkish_).
I found the English stop word list in the documentation, but I want to check whether it is the one my server really uses, and also check the stop word lists for the other languages.
The stop words used by the English Analyzer are the same as the ones defined in the Standard Analyzer, namely the ones you found in the documentation.
The stop word files for all other languages can be found in the Lucene repository in the analysis/common/src/resources/org/apache/lucene/analysis folder.
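If you want to double-check what your own server actually removes, one option (a hedged sketch, assuming a cluster reachable on localhost:9200) is to run a few candidate words through the _analyze API with an inline stop filter and see which tokens survive:

    import requests

    body = {
        "tokenizer": "standard",
        "filter": [{"type": "stop", "stopwords": "_english_"}],
        "text": "the quick brown fox and not into it",
    }
    resp = requests.post("http://localhost:9200/_analyze", json=body)
    kept = [t["token"] for t in resp.json()["tokens"]]
    print(kept)   # any word missing from this list was treated as a stop word

Swapping _english_ for any of the other predefined variables lets you probe the remaining languages the same way.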

Labelling text using Notepad++ or any other tool

I have several .dat files containing information about hotel reviews, as below:
/*
<Author> simmotours
<Content> review......goes here
<Date>Nov 18, 2008
<No. Reader>-1
<No. Helpful>-1
<Overall>4
<Value>4
<Rooms>3
<Location>4
<Cleanliness>4
<Check in / front desk>4
<Service>4
<Business service>-1
*/
I want to classify the reviews into pos and neg, i.e. have two folders, pos and neg, containing the files whose reviews score above 3 (positive) and below 3 (negative) respectively.
How can I quickly and efficiently automate this process?
You could write a Python script to read the overall score. Do this by looping over the lines using readline() (see here), find the "Overall" score with some string parsing, then move the file into the right directory. These are all very simple things to do in Python; just break the task down into steps and search for answers to those steps.
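A rough sketch of that approach (assuming one review per .dat file in a reviews/ folder, and that the <Overall> line looks exactly like the sample above):

    import re
    import shutil
    from pathlib import Path

    src = Path("reviews")                    # placeholder input folder
    for folder in ("pos", "neg"):
        Path(folder).mkdir(exist_ok=True)

    for dat in src.glob("*.dat"):
        score = None
        with dat.open(encoding="utf-8", errors="ignore") as f:
            for line in f:
                m = re.match(r"<Overall>\s*(-?\d+)", line.strip())
                if m:
                    score = int(m.group(1))
                    break
        if score is None:
            continue                         # no <Overall> line found
        if score > 3:
            shutil.move(str(dat), Path("pos") / dat.name)
        elif score < 3:
            shutil.move(str(dat), Path("neg") / dat.name)

Reviews scoring exactly 3 are left where they are; adjust the comparisons if you want them in one of the two folders.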
Notepad++ can do replacements with regular expressions and allows the definition of macros. Use them to convert the file to an XML file; check out the help file.
Then you can read it with any scripting language and do what you want.
Alternatively, you could change the file into a form you can load into Excel and do the analysis there.

How to concatenate SVG files lengthwise from linux command line?

I have a series of square SVG files that I would like to arrange lengthwise into one super long SVG file.
I attempted to use ImageMagick to combine them, based on this page:
http://linux.about.com/library/cmd/blcmdl1_ImageMagick.htm
and this
http://www.imagemagick.org/Usage/compose/
I tried this command:
composite 'file1.svg' 'file2.svg' +adjoin 'outputfile.svg'
However, I received the following error message:
composite: unrecognized option '+adjoin' # error/composite.c/CompositeImageCommand/565.
I tried several other ImageMagick commands (convert, display), but had no success. How can I combine these files on the command line? Is there an Inkscape command that does this?
There's currently no convenient way to do this with only the command line and no custom scripting.
The closest pre-written thing I could find currently (4-16-2012) is https://github.com/astraw/svg_stack, which lets you write commands of the form:
svg_stack.py --direction=h --margin=100 red_ball.svg blue_triangle.svg > shapes.svg
to concatenate.
It should be pretty easy if you're willing to use a scripting language. For each file, just add a prefix to all id attributes; so in file 1, id="circle" becomes id="file1_circle", and in file 2, id="circle" becomes id="file2_circle".
In most cases you would get away with a trivial search and replace (find id=" and replace it with id="fileX_), although there are cases where this won't work (specifically if that string appears in an item of text, for example).
If you want to do this 'the proper way', you'll need an XML parser (such as XMLReader in PHP). A rough sketch of the simple approach is shown below.
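For what it's worth, here is a quick Python sketch of the trivial search-and-replace version (it also prefixes url(#...) and href="#..." references so they keep pointing at the renamed ids; the file names are placeholders):

    import re
    from pathlib import Path

    def prefix_ids(svg_text, prefix):
        # Prefix every id, and every reference to an id, with the file name.
        svg_text = re.sub(r'id="', f'id="{prefix}_', svg_text)
        svg_text = re.sub(r'url\(#', f'url(#{prefix}_', svg_text)
        svg_text = re.sub(r'href="#', f'href="#{prefix}_', svg_text)
        return svg_text

    for path in [Path("file1.svg"), Path("file2.svg")]:
        renamed = prefix_ids(path.read_text(encoding="utf-8"), path.stem)
        path.with_name(path.stem + "_renamed.svg").write_text(renamed, encoding="utf-8")

The renamed files can then be merged (for example with svg_stack, as in the other answer) without the duplicate-id problem.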
