I deployed my project on Heroku with heroku/python as the buildpack, and then with the GitHub link from the "Learn more" section of the image as the buildpack. It is not working with either buildpack.
Please help me out.
It seems you should create an nltk.txt file listing the corpora you are interested in, as mentioned in the link. In order to use NLTK on Heroku, you have to download the corpora and make them available to your application.
The file is not required, so the buildpack is simply letting you know that no corpora will be downloaded without it.
You can go to http://www.nltk.org/nltk_data/ to see a list of available corpora, or run the following in a Python terminal:
>>> import nltk
>>> nltk.download()
Then simply choose and install what you want.
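For example, a minimal nltk.txt at the root of the repo might look like the following (the package identifiers below are only illustrations; list whichever corpora your app actually needs, one per line):

```
punkt
stopwords
wordnet
```

On the next deploy, the buildpack should download each listed package for you.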
Related
I am currently using a python script in my Azure pipeline
Import data as Dataframe --> Run Python Script --> Export Dataframe
My script is developed locally and I get import errors when trying to import tensorflow... No problem, guess I just have to add it to environment dependencies somewhere -- and it is here the documentation fails me. They seem to rely on the SDK without touching the GUI, but I am using the designer.
I have at this point already built some environments with the dependencies, but how to utilize these environments at the run or script level is not obvious to me.
It seems trivial, so any help on how to use the modules is greatly appreciated.
To use modules that are not preinstalled (see Preinstalled Python packages), you need to add a zipped file containing the new Python packages to the Script bundle port. See the description in the document:
To include new Python packages or code, connect the zipped file that contains these custom resources to Script bundle port. Or if your script is larger than 16 KB, use the Script Bundle port to avoid errors like CommandLine exceeds the limit of 16597 characters.
Bundle the script and other custom resources to a zip file.
Upload the zip file as a File Dataset to the studio.
Drag the dataset module from the Datasets list in the left module pane in the designer authoring page.
Connect the dataset module to the Script Bundle port of Execute Python Script module.
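Once the zip is connected, the Execute Python Script module invokes an azureml_main entry point in your script; a minimal sketch (the module name my_helpers is a hypothetical example of code packaged inside the uploaded zip):

```python
def azureml_main(dataframe1=None, dataframe2=None):
    # dataframe1 / dataframe2 arrive as pandas DataFrames from the
    # module's two input ports.
    # Code from the Script Bundle zip is importable directly, e.g.:
    # from my_helpers import clean  # hypothetical module in the zip
    return dataframe1,  # must return a tuple of DataFrames
```

The returned tuple is what flows out of the module's output port to the next step in the pipeline.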
Please check out the document How to configure Execute Python Script.
For more information about how to prepare and upload these resources, see Unpack Zipped Data.
You can also check out this similar thread.
I have a Python package which depends on PyTorch and which I'd like Windows users to be able to install via pip (the specific package is: https://github.com/mindsdb/lightwood, but I don't think this is very relevant to my question).
What are the best practices for going about this?
Are there some projects I could use as examples?
It seems like the PyPI-hosted versions of torch and torchvision aren't Windows-compatible, and the "getting started" section suggests installing from the custom PyTorch repository, but beyond that I'm not sure what the ideal solution would be to incorporate this as part of a setup script.
What are the best practices for going about this?
If your project depends on other projects that are not distributed through PyPI then you have to inform the users of your project one way or another. I recommend the following combination:
clearly specify (in your project's documentation pages, or in the project's long description, or in the README, or anything like this) which dependencies are not available through PyPI (and possibly the reason why, with the appropriate links) as well as the possible locations to get them from;
to facilitate the user experience, publish alongside your project a pre-prepared requirements.txt file with the appropriate --find-links options.
The reason why (or main reason, there are others), is that anyone using pip assumes that (by default) everything will be downloaded from PyPI and nowhere else. In other words anyone using pip puts some trust into pypi.org as a source for Python project distributions. If pip were suddenly to download artifacts from other sources, it would breach this trust. It should be the user's decision to download from other sources.
So you could provide in your project's documentation an example requirements.txt file like the following (note that pip expects --find-links on its own line in a requirements file, rather than appended to a requirement):
# ...
--find-links https://download.pytorch.org/whl/torch_stable.html
torch===1.4.0
torchvision===0.5.0
# ...
Update
The best solution would be to help the maintainers of the projects in question to publish Windows wheels on PyPI directly:
https://github.com/pytorch/pytorch/issues/24310
https://github.com/pytorch/vision/issues/1774
https://pypi.org/help/#file-size-limit
I've recently been working with the NLTK library for language processing. I can normally install packages using nltk.download('package') when I have internet access.
The problem arises if I try to run my code offline on a cluster. Here,
from nltk.tag import PerceptronTagger
ImportError: cannot import name 'PerceptronTagger'
and similar errors emerge, as NLTK can't seem to find the nltk_data folder. I tried:
nltk.data.path.append("./nltk_data"), after copying nltk_data along with the code;
nltk.download('punkt', download_dir="./nltk_data"), but this doesn't work, as there is no internet access.
Question is then, how can I use nltk_data locally?
Thanks.
It appears the machine I was running this on had NLTK 3.0.2, so updating NLTK solved the problem altogether.
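For anyone who still needs the data itself to resolve offline, a minimal sketch of pointing NLTK at a bundled nltk_data folder before importing it (the ./nltk_data path is an assumption about where the folder was copied next to the code):

```python
import os

# NLTK consults the NLTK_DATA environment variable when searching for
# data, so set it before the first "import nltk".
os.environ["NLTK_DATA"] = os.path.abspath("./nltk_data")

# import nltk  # NLTK would now search ./nltk_data first
```

Setting the variable up front avoids having to call nltk.data.path.append() in every module that touches NLTK.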
I am building a chat bot with rasa-nlu. I went through the tutorial and I have built a simple bot. However, I need lots of training data for building a chat bot that is able to book a taxi. So I need data to build a specific bot.
Is there a repository, or corpus, for booking a taxi?
Or is there a way to generate this kind of dataset?
This is a blog post from one of the founders of Rasa, and I think it has some really excellent advice. I think you're going about it the wrong way by asking for a pre-built training set. Start it yourself, then add friends, etc., until you've built a training set that works best for your bot.
Put on your robot costume
Beyond that the Rasa docs have this under improving model performance
When the rasa_nlu server is running, it keeps track of all the predictions it's made and saves these to a log file. By default log files are placed in logs/. The files in this directory contain one json object per line. You can fix any incorrect predictions and add them to your training set to improve your parser.
I think you'll be surprised how far you can get with just the training set you can come up with yourself.
Good luck on finding the corpus, but either way hope these links and snippets helped.
One method of doing this is to head over to LUIS.AI.
Log in using Office 365 and make your own taxi-booking app by defining intents and utterances.
After training and publishing the model, download the corpus (LUIS exports it in JSON format).
Install RASA NLU. I have Windows 8.1 on my machine, so the steps are as follows:
These are the steps to configure RASA:
First install:
Anaconda 4.3.0 64-bit Windows for installing Python 3.6 interpreter: https://repo.continuum.io/archive/Anaconda3-4.3.0-Windows-x86_64.exe
&
Python Tools for Visual Studio 2015: https://ptvs.azureedge.net/download/PTVS%202.2.6%20VS%202015.msi
Next, install the following packages in this order in administrative mode in command prompt:
Spacy Machine Learning Package: pip install -U spacy
Spacy English Language Model: python -m spacy download en
Scikit Package: pip install -U scikit-learn
Numpy package for mathematical calculations: pip install -U numpy
Scipy Package: pip install -U scipy
Sklearn Package for Intent Recognition: pip install -U sklearn-crfsuite
NER Duckling for better Entity Recognition with Spacy: pip install -U duckling
RASA NLU: pip install -U rasa_nlu==0.10.4
After installing all the above packages successfully, make a spaCy configuration file which will be read by RASA, as follows:
{
  "project": "Travel",
  "pipeline": "spacy_sklearn",
  "language": "en",
  "num_threads": 1,
  "max_training_processes": 1,
  "path": "C:\\Users\\Kunal\\Desktop\\RASA\\models",
  "response_log": "C:\\Users\\Kunal\\Desktop\\RASA\\log",
  "config": "C:\\Users\\Kunal\\Desktop\\RASA\\config_spacy.json",
  "log_level": "INFO",
  "port": 5000,
  "data": "C:\\Users\\Kunal\\Desktop\\RASA\\data\\FlightBotFinal.json",
  "emulate": "luis",
  "spacy_model_name": "en",
  "token": null,
  "cors_origins": ["*"],
  "aws_endpoint_url": null
}
Next, Make a directory structure like this:
data folder -> Will contain all LUIS formatted corpus
models -> Will contain all trained models
logs -> Will contain active learning logs and RASA framework logs
Now, make batch file scripts for Training and Starting RASA NLU Server.
Make a TrainRASA.bat with Notepad or Visual Studio Code and write this:
python -m rasa_nlu.train -c config_spacy.json
pause
Now make a StartRASA.bat with Notepad or Visual Studio Code and write this:
python -m rasa_nlu.server -c config_spacy.json
pause
Now train and start the RASA server by clicking on the batch file scripts that you just made.
Now everything is ready; just fire up Chrome and issue an HTTP GET request to your endpoint /parse,
like: http://localhost:5000/parse?q=&project=
You will get a JSON response that corresponds to the LUISResult class of Bot Framework C#.
Now handle the business logic you want to perform after doing that.
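A sketch of building such a request URL from code (the query text and the "Travel" project name are example values matching the configuration above):

```python
from urllib.parse import urlencode

# Build the GET URL for the local RASA NLU /parse endpoint.
params = {"q": "book me a taxi to the airport", "project": "Travel"}
url = "http://localhost:5000/parse?" + urlencode(params)
print(url)
```

Issuing a GET to that URL (with the server running) returns the LUIS-style JSON described above.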
Alternatively, you can take a look at RASA Core; it was mainly built for this purpose.
RASA Core, which uses machine learning to build dialogs instead of simple if-else statements.
The below link contains datasets relevant for commercial chatbot applications ('human-machine' dialogues). It's a fairly comprehensive collection of both human-human and human-machine text dialogue datasets, as well as audio dialogue datasets. https://breakend.github.io/DialogDatasets/
We faced the same problem while trying to build a love-relationship coach bot. Long story short, we decided to create a simple tool to collect data from our friends, our colleagues, or people on Mechanical Turk: https://chatbotstrap.io.
The idea is to create polls like this one: https://chatbotstrap.io/en/project/q5pimyskbhna2rm?language=en&nb_scenarios=10
and send them to anyone you know. With that solution, we were able to build a dataset of more than 6000 sentences divided into 10 intents in a few days.
The tool is free as long as you agree that the dataset constructed with it can be open-sourced. There are also paid plans if you prefer to be the sole beneficiary of the data you collect.
I am a newbie learning how to code in Swift on Linux.
Right now I am trying to use the Perfect framework so I can create the REST service (or something like that). I am following the instructions in this video (I found the link on the perfect.org site):
https://videos.raywenderlich.com/courses/77-server-side-swift-with-perfect/lessons/1
I did everything just like in the video, but the problem occurs when I have to edit the main.swift file and use import to use the PerfectLib, PerfectHTTP and PerfectHTTPServer libraries/modules(?). When I run it, the error shows on the terminal saying:
main.swift:1:8: error: no such module 'PerfectHTTP'
import PerfectHTTP
Same with other modules. Do I have to place downloaded files from Perfect to some special directory within swift directory? Or maybe the files in download link are not complete?
Before doing any server-side Swift, please temporarily forget Xcode and try a new toolchain called the Swift Package Manager. Open a terminal in a blank folder and type swift package init; it will set up a blank project which contains a Package.swift, a folder named Sources, and a Tests directory as well.
Now you have to edit Package.swift before importing anything into your source code. For example, the starter-template Perfect server has such a Package.swift:
import PackageDescription
let package = Package(
    name: "PerfectTemplate",
    targets: [],
    dependencies: [
        .Package(url: "https://github.com/PerfectlySoft/Perfect-HTTPServer.git", majorVersion: 2)
    ]
)
Then you can import any Perfect libraries included in the Perfect-HTTPServer.git
Here is the importing part as defined in the main.swift of PerfectTemplate:
import PerfectLib
import PerfectHTTP
import PerfectHTTPServer
So I would suggest that the best practice is to try Perfect Assistant: https://assistant.perfect.org/perfectassistant/Perfect%20Assistant.dmg which can handle most of the tricky operations such as dependency management, building on Linux, and production-server deployment.
For more information about Perfect beyond the tutorial video, see this link: http://www.perfect.org/docs/