How can I include fast.ai functionality when primarily using PyTorch? - pytorch

I am using PyTorch to carry out vision tasks, but would like to use some of what fast.ai provides since it has a lot of useful functionality. I'd prefer to work mostly in PyTorch since it's easier for me to understand what's going on, it's easier for me to find information on it online, and I want to maintain flexibility.
In https://docs.fast.ai/migrating_pytorch it's written that after I use the following imports: from fastai.vision.all import * and from migrating_pytorch import *, I should be able to start "Incrementally adding fastai goodness to your PyTorch models", which sounds great.
But when I run the second import I get ModuleNotFoundError: No module named 'migrating_pytorch'. Searching in https://github.com/fastai/fastai I also don't find any code mention of migrating_pytorch.py, nor did I manage to find something online.
(I'm using fast.ai version 2.3.1)
I'd like to know if this is indeed the way to go, and if so how to get it working. Or if there's a better way then how I should use that approach instead.
As an example, it would be nice if I could use the EarlyStoppingCallback, SaveModelCallback, and add some metrics from fast.ai instead of writing them myself, while still having everything in mostly "native" PyTorch.
Preferably the solution isn't specific to vision only, but that's my current need.

migrating_pytorch is an example script. It's in the fast.ai repo at: https://github.com/fastai/fastai/blob/master/nbs/examples/migrating_pytorch.py
The notebook that shows how to use it is at: https://github.com/fastai/fastai/blob/827e7cc0fad2db06c40df393c9569309377efac0/nbs/examples/migrating_pytorch.ipynb
For the callback example. Your training code would end up looking something like:
cbs = [EarlyStoppingCallback(), SaveModelCallback()]
learner = Learner(dls, simple_cnn(), loss_func=F.cross_entropy, cbs=cbs)
learner.fit(1)
Those two callbacks probably need some arguments, e.g. save path, etc.

Related

How to best find code implementations in existing python projects

Different people have told me that in order to improve my Python programming skills, it helps to go and look how existing projects are implemented. But I am struggeling a bit to navigate through the projects and find the parts of the code I'm interested in.
Let's say I'm using butter of the scipy.signal package, and I want to know how it is implemented, so I'm going to scipy's github repo and move to the signal folder. Now, where is the first place I should start looking for the implementation of butter?
I am also a bit confused about what a module/package/class/function is. Is scipy a module? Or a package? And then what is signal? Is there some kind of pattern like module.class.function? (Or another example: matplotlib.pyplot...)
It sounds like you have two questions here. First, how do you find where scipy.signal.butter is implemented? Second, what are the different hierarchical units of Python code (and how do they relate to that butter thing)?
The first one actually has an easy solution. If you follow the link you gave for the butter function, you will see a [source] link just to the right of the function signature. Clicking on that will take you directly to the source of the function in the github repository (pinned to the commit that matches the version of the docs you were reading, which is probably what you want). Not all API documentation will have that kind of link, but when it does it makes things really easy!
As for the second question, I'm not going to fully explain each level, but here are some broad strokes, starting with the most narrow way of organizing code and moving to the more broad ways.
Functions are reusable chunks of code that you can call from other code. Functions have a local namespace when they are running.
Classes are ways of organizing data together with one or more functions. Functions defined in classes are called methods (but not all functions need to be in a class). Classes have a class namespace, and each instance of a class also has its own instance namespace.
Modules are groups of code, often functions or methods (but sometimes other stuff like data too). Each module has a global namespace. Generally speaking, each .py file will create a module when it is loaded. One module can access another module by using an import statement.
Packages are a special kind of module that's defined by a folder foo/, rather than a foo.py file. This lets you organize whole groups of modules, rather than everything being at the same level. Packages can have further sub-packages (represented with nested folders like foo/bar/). In addition to the modules and subpackages that can be imported, a package will also have its own regular module namespace, which will be populated by running the foo/__init__.py file.
To bring this back around to your specific question, in your case, scipy is a top-level package, and scipy.signal is a sub-package within it. The name butter is a function, but it's actually defined in the scipy/signal/_filter_design.py file. You can access it directly from scipy.signal because scipy/signal/__init__.py imports it (and all the other names defined in its module) with from ._filter_design import * (see here).
The design of implementing something in an inner module and then importing it for use in the package's __init__.py file is a pretty common one. It helps modules that would be excessively large to be subdivided, for ease of their developers, while still having a single place to access a big chuck of the API. It is, however, very confusing to work out for yourself, so don't feel bad if you couldn't figure it out yourself. Sometimes you may need to search the repository to find the definition of something, even if you know where you're importing it from.

How can I see source code or explanation of "torch_sparse import SparseTensor"?

I am studying some source codes from PytorchGeometric.
Actually I am really finding from torch_sparse import SparseTensor in Google, to get how to use SparseTensor.
But there is nothing I can see explanation. I saw many documents about COO,CSR something like that, but how can I use SparseTensor?
I read : https://pytorch.org/docs/stable/sparse.html# but there is nothing like SparseTensor.
Thank you in advance :)
I just had the same problem and stumbled upon your question, so I will just detail what I did here, maybe it helps someone. I think the main confusion results from the naming of the package. SparseTensoris from torch_sparse, but you posted the documentation of torch.sparse. The first is an individual project in the pytorch ecosystem and a part of the foundation of PyTorch Geometric, but the latter is a submodule of the actual official PyTorch package.
So, looking at the right package (torch_sparse), there is not much information about how to use the SparseTensor class there (Link).
If we go to the source code on the other hand (Link) you can see that the class has a bunch of classmethods that you can use to genereate your own SparseTensor from well documented pytorch classes.
In my case, all I needed was a way to feed the RGCNConvLayer with just one Tensor including both the edges and edge types, so I put them together with the following line:
edge_index = SparseTensor.from_edge_index(edge_index, edge_types)
If you, however, already have a COO or CSR Tensor, you can use the appropriate classmethods instead.

Using Concert Technology application or Callable Library application with CVXPY

I am looking at a two step approach for a optimization problem. My first step is using a MILP formulation of the problem and the second step involves using the solution from the first step as an initial solution but now with a MIQP formulation. I have been able to apply this concept in MATLAB using CPLEX. However, I am now trying the same using CVXPY with CPLEX as the solver. Now I know about the warm_start option but this does not work with the CPLEX solver. I am able to set CPLEX parameters but I am not sure how to initialize my solution. I am thinking of setting the ADVANCE START SWITCH parameter for CPLEX to 1, but now I need to set the initial solution. According to this page: http://www-eio.upc.es/lceio/manuals/cplex-11/html/usrcplex/solveMIP17.html, I need to use the method setVectors in a Concert Technology application, or by using CPXcopymipstart in a Callable Library application to set the initial solution. I am unsure of how to use this along with CVXPY.
The functionality you are looking for does not currently exist in CVXPY. CVXPY is a generic modeling layer that wraps around several solvers and it does not expose the CPLEX-specific CPXreadcopymipstarts nor CPXaddmipstarts functionality.
The fact that setting the value property of variables and using the warm_start option, as suggested in this answer, doesn't work, is a CVXPY issue. It looks like there is an open github issue for this here. In the future, this will likely be the intended solution to your general question.
For now, you'll have to use one of the CPLEX APIs directly. As you mentioned in the comments of this related stackoverflow question, you do not like the idea of using the lower-level CPLEX Python API. That leaves you with docplex as a viable option.

Library to use for event extraction from text?

My project is to extract events automatically from a given text. The events can be written specifically, or just mentioned in a sentence, or not be there at all.
What is the library or technique that I should use for this? I tried the Stanford NER demo, but it gave bad results. I have enough time to explore and learn a library, so complexity is not a problem. Accuracy is a priority.
The task itself is a complex one, it is still not solve now(2018). But recently a very useful python library for nlp is emerging. It is Spacy, this lib has a relative higher performance than its competitors. With the library you can do things like tokenize,POS tagging,NER and sentence similarity。But you still need to utilize these features and extract events based on your specific rule.
Deeplearning may also be a way to try, but even though it works fine on other nlp tasks, it is still immature on event extraction.

How to implement and use the sub-modes feature from System.Console.CmdArgs

I would like to create a kind of swiss-knife tool for a specific domain, and a "cabal" or "darcs" command-line interface looks perfect.
Using the on-line tutorials I could implement a simple "hello, world" program. Then I implemented a more sophisticated solution with modes and all when well.
But now, I would like to explore the "sub-modes" to have a good understanding of all the possibilities, and I'm stuck. I could not find any tutorial, example or detailed description of the feature.
How to implement and use the submodes feature?
I want to clarify that I understand modes, but it's really the submodes that are not clear to me.
As mentioned above, CmdArgs: Easy Command Line Processing, linked from the project home page, is the place to start. It includes some examples; if they are unclear I'd fetch their full code and play around with it.
The also-mentioned search results include Haskell: Using CmdArgs (Single and Multi-Mode) and Building a Haskell CLI Utility with CmdArgs.
hledger's use of cmdargs is another example. It's a bit more complicated, allowing modes to be imported and reused across multiple executables.
The cmdargs tutorial has examples for sub-modes. The documentation for the modes function is also clear.
In fact, a Google search for "cmdargs modes" reveals quite a few more tutorials covering exactly this.

Resources