Open multiple positions in Backtrader - python-3.x

Does someone know if it's possible to open multiple positions with only a single data feed? I am trying to do a second buy while already in a position, which doesn't seem to be possible.
Nobody seems to address this issue. Does anyone have experience with Backtrader and any input?

If you are just trying to buy more stock to add to your position, then yes, you should be able to do this; if you cannot, recheck your strategy code in next().
If you are trying to track two separate positions of the same data...
One cannot have two separate positions in the same data feed. You may trade additional positions if you like but they will be combined in Backtrader. Even if you use two strategies you will still have one combined broker.
The reason for this is to simulate real-world conditions as closely as possible. If you have a brokerage account, you would most likely have just one position. (I know there are exceptions.)
One solution would be to manually track, in a dictionary, the trades that result from different signals/sub-strategies, as in the sketch below. It's a bit more tedious to develop but very doable.
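A minimal sketch of what that bookkeeping might look like, assuming a single illustrative moving-average rule (the indicators, sizes, and signal names are placeholders, not anything from the question):

```python
import backtrader as bt

class MultiSignalStrategy(bt.Strategy):
    def __init__(self):
        # Per-signal "virtual" positions: the broker still nets everything
        # into one real position, but this dict remembers who owns what.
        self.virtual_positions = {'signal_a': 0, 'signal_b': 0}
        self.sma_fast = bt.ind.SMA(period=10)   # illustrative indicators only
        self.sma_slow = bt.ind.SMA(period=30)

    def next(self):
        # Sub-strategy A enters on a fast/slow crossover (placeholder rule)
        if self.sma_fast[0] > self.sma_slow[0] and self.virtual_positions['signal_a'] == 0:
            self.buy(size=10)
            self.virtual_positions['signal_a'] = 10
        # ... and exits on the opposite cross, selling only its own size,
        # leaving sub-strategy B's share of the combined position untouched.
        elif self.sma_fast[0] < self.sma_slow[0] and self.virtual_positions['signal_a'] > 0:
            self.sell(size=self.virtual_positions['signal_a'])
            self.virtual_positions['signal_a'] = 0
```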

Related

No depot VRP - roadside assistance

I am researching a problem that is fairly unusual.
Imagine a roadside assistance company that wants to dynamically route its vehicles. For each batch of new incidents, it wants to create routes that satisfy them, subject to some constraints (time constraints, road accessibility, vehicle-incident matching).
The company has a heterogeneous fleet of vehicles (motorbikes for the easy cases, up to tow trucks for the hard ones), and each incident states its requirements (we know whether it needs just fuel or needs towing).
There is no depot, only the vehicles roaming on the streets.
The objective is to dynamically create routes on the way, having in mind the minimization of time and the total traveled distance.
Have you ever met such a problem? Do you have any idea in which VRP variant it belongs?
I have seen two previous questions but unfortunately they don't fit with my problem.
They are optaplanner - VRP but with no depot and Does optaplanner out of box support VRP with multiple trips and no depot, which are both open VRPs.
Unfortunately I don't have code right now, as I am still modelling the way I will approach this problem.
I am really sorry for creating a suggestion question and not a real one.
Thank you so much in advance.
It's a rich dynamic/realtime vehicle routing problem. You won't find an exact name for your problem, as when VRPs get too complex they don't fit inside any of the standard categories.
It's clearly a dynamic/realtime problem (the terms are used interchangeably) as you would typically only find out about roadside breakdowns at short notice.
Sometimes you're servicing a broken down car, which would be a single stop (so a vehicle routing problem). Sometimes you're towing a car, which would be a pick-up delivery problem. So you have a mix of both together.
You would want to get to the broken-down vehicles ASAP, and some would need fixing sooner than others (think of a car broken down in a dangerous position on a motorway). You would therefore need soft time windows, so you can penalise lateness, instead of the hard time windows supported in most standard VRP formulations.
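For illustration, a soft time window can be as simple as a lateness term added to the route cost. A sketch, where the weight is an arbitrary assumption you would tune per incident severity:

```python
def lateness_penalty(arrival_time, latest_allowed, weight_per_minute=10.0):
    """Soft time window: arriving late is permitted, but it costs."""
    delay_minutes = max(0.0, arrival_time - latest_allowed)
    return weight_per_minute * delay_minutes

def route_cost(travel_minutes, stops):
    # stops: list of (arrival_time, latest_allowed) pairs along the route.
    # The optimiser can now trade lateness at one stop against savings
    # elsewhere, instead of rejecting the route outright.
    return travel_minutes + sum(lateness_penalty(a, l) for a, l in stops)

# A dangerous motorway breakdown would simply get a much higher weight.
```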
Also for you to be able to scale to larger problems, you need an incremental optimiser that can restart from the previous (possibly now infeasible) solution when new jobs are added, vehicle positions are changed etc. This isn't supported out of the box in the open source solvers I know of.
We developed a commercial engine which does the above. We started off using the jsprit library, which supports mixing single stop and pickup delivery problems together. We later had to replace jsprit due to the amount of code we had to override to get it running happily for realtime problems, however jsprit may still prove a useful starting point for you. We discuss some of the early technical obstacles we had to overcome in getting jsprit to handle realtime problems in this white paper.

data and structure cleansing of excel sheets

I have over 6,000 Excel sheets. While all the sheets describe the same thing, they are independently formatted. They all have between 9 and 13 columns, but the columns are out of order, the column names are independently misspelled, and each sheet may or may not have a second, or third, column header.
I am currently trying, in Python, to read cells in a left-down-right-up motion to locate the same data, but there are simply too many differences in structure, column ordering, and data definitions to pin them down one at a time. Is there a tool that I can use to read these documents and conform them to a single format, via a rapid mapping function?
Thanks much.
Wow, it's the Ultimate Data Horror Story.
I want to ask how you ever let it get this way... but I actually don't want to know; I'm already going to have nightmares about this.
It's like that Hoarding show on TV, but with data.
No, I'm afraid that if you can't even identify a pattern then there's no magic function that will be able to either.
But that doesn't mean it's a lost cause. It's just going to need some human interaction, and there are ways to minimize the pain.
What you need is a custom interface that will load the documents one by one, and will walk a human through clicking each relevant column or area, and then automatically load the next document.
There would also need to be buttons for sorting out things like obvious garbage sheets (blanks?), "unknowns" (which get put in a folder for advanced research later), and other "unpredictables" that may come up during the process.
Also, perhaps once you get into it, you'll notice a pattern you're not thinking of, like maybe "the person who handled the files from 2002 to 2004 set them up this way", or "when Budget is misspelled, it's always either Bugdet or Budteg".
In this scope, little patterns like that can make a big difference.
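To make that last point concrete: a minimal, hypothetical sketch of how such misspelling patterns could be exploited with fuzzy matching (the canonical column names here are invented stand-ins for your real schema):

```python
import difflib

CANONICAL = ["Budget", "Department", "Quarter", "Amount"]  # your real schema

def guess_column(raw_header, cutoff=0.6):
    """Map a (possibly misspelled) header onto the canonical schema;
    return None so a human can be asked when nothing is close enough."""
    matches = difflib.get_close_matches(raw_header.strip(), CANONICAL,
                                        n=1, cutoff=cutoff)
    return matches[0] if matches else None

print(guess_column("Bugdet"))   # -> 'Budget'
print(guess_column("Budteg"))   # -> 'Budget'
print(guess_column("???"))      # -> None: route to the "unknowns" folder
```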
Depending on your coding skills, you may or may not need outside assistance with this. I assume this is not data that can just get thrown out, or you wouldn't be asking...
If each document took an average of 20 seconds to process, that would be about 33 hours in total. An hour a day and it's done in a month; or someone full-time, and it's done in a week.
Do you have a budget you can throw at this? Data archaeology is an actual thing! Hell, I'll do it for you for the right price... (wouldn't break the bank, depending on how urgent it is, of course!)
Either way, this ain't going to be fun for "someone"...

Millions of rows in GUI

I would like to implement a GUI handling a huge number of rows and I need to use GTK in Linux.
I started having a look at GtkTreeView with list models, but I don't think that adding millions of lines directly to that widget will result in a GUI that doesn't slow down the application.
Do you know whether there is a GTK widget already in place for this problem, or do I have to handle the window frame that must display those lines myself? If necessary, I would draw the data directly using GtkDrawingArea (essentially writing a new widget).
Any suggestion about any GTK topic or project I can look as starting point for my research?
As suggested in the comments, you can use the Cell Data Func and get the displayed data under control. But I have another idea: millions of lines are much, much more than any amount of information a human user can see and understand. So maybe a better, more usable and user-friendly solution is to display the data in a way the users can more easily navigate.
Imagine opening a huge hierarchy, scrolling down, and forgetting what were the top-level items you opened.
Example of a possible solution: have a combo box which allows the user to choose a filter or category, reducing the data to an amount the user can more easily navigate and, if necessary, build a mental model of (a sketch follows the note below).
NOTE: As far as I know, GtkTreeView doesn't support sorting/filtering and drag-n-drop at the same time, so if you want to use both features, I suggest you use the existing drag-n-drop functionality (otherwise very complicated to implement by hand) and implement your own sorting/filtering.
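A minimal PyGObject sketch of the filtering idea, assuming GTK 3 and invented category names. Gtk.TreeModelFilter only changes what the view renders; the full data still lives in the store:

```python
import gi
gi.require_version("Gtk", "3.0")
from gi.repository import Gtk

store = Gtk.ListStore(str, str)            # (category, text) rows
# ... append your rows to `store` here ...

active_category = ["all"]                  # mutable holder for the filter state

def row_visible(model, treeiter, data):
    # Show a row only if it matches the category chosen in the combo box.
    return active_category[0] in ("all", model[treeiter][0])

filtered = store.filter_new()
filtered.set_visible_func(row_visible)
view = Gtk.TreeView(model=filtered)        # the view renders visible rows only

combo = Gtk.ComboBoxText()
for cat in ("all", "network", "disk"):     # invented categories
    combo.append_text(cat)

def on_changed(widget):
    active_category[0] = widget.get_active_text()
    filtered.refilter()                    # re-run row_visible over the store

combo.connect("changed", on_changed)
```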

Dynamics CRM 2011 Import Data Duplication Rules

I have a requirement to regularly import data from Excel (CSV) into Dynamics CRM.
Instead of using simple Data Duplication Rules, I need to implement a point system to determine whether a record is considered a duplicate or not.
Let me give an example. Suppose these are the rules for a particular import:
First Name, exact match, 10 pts
Last Name, exact match, 15 pts
Email, exact match, 20 pts
Mobile Phone, exact match, 5 pts
And then a threshold value of 19 pts.
Now, if a record has its First Name and Last Name matching an old record in the entity, that's 25 pts, which is higher than the threshold (19 pts), so the record is considered a duplicate.
If, for example, a record only has the same First Name and Mobile Phone, that's 15 pts, which is lower than the threshold, and it is thus considered a non-duplicate.
What is the best approach to achieve this requirement? Is it possible to utilize the default Import Data functionality in MS CRM? Is there any 3rd-party add-on that answers my requirements above?
Thank you for all the help.
Updated
Hi Konrad, thank you for your suggestions, let me elaborate here:
Excel. You could filter out the data using Excel and then, once you've obtained a unique list, import it.
Nice one, but I don't think it is really workable in my case; the data will be coming in regularly from the client in moderate numbers (hundreds to thousands), and typically the client won't check the data for duplicates.
Workflow. Run a process removing any instance calculated as a duplicate.
Workflow is a good idea; however, since it is processed asynchronously, my concern is that in some cases the user may already have made updates/changes to the inserted data before the workflow finishes, thereby creating data inconsistency, or at the very least a confusing user experience.
Plugin. On every creation of a new record, you'd check if it's to be regarded as duplicate-ish and cancel its creation (or mark it for removal).
I like this approach. So I just import as usual (for example, into the contact entity), but I already have a plugin in place that gets triggered every time a record is created; the plugin will check whether the record is duplicate-ish or not and take the necessary action.
I haven't fiddled a lot with duplicate detection, but looking at your criteria, you might be able to make rules that match those: pretty much three rules to cover your cases, namely a full name match, a last name and mobile phone match, and an email match.
If you want to do the points system, I haven't seen any out-of-the-box components that solve this; however, CRM Extensions have a product called Import Manager that might have that kind of duplicate detection. They claim to support customized duplicate checking. Might be worth asking them about this.
Otherwise it's custom coding that will solve this problem.
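If you do go the custom-coding route, the scoring itself is tiny. A hedged Python sketch of the logic (a real CRM plugin would be in C#, and the field names here are guesses):

```python
RULES = [("firstname", 10), ("lastname", 15), ("email", 20), ("mobilephone", 5)]
THRESHOLD = 19

def duplicate_score(candidate, existing):
    """Add the rule's points for every field with an exact match."""
    score = 0
    for field, points in RULES:
        a, b = candidate.get(field), existing.get(field)
        if a and b and a.strip().lower() == b.strip().lower():
            score += points
    return score

def is_duplicate(candidate, existing):
    return duplicate_score(candidate, existing) > THRESHOLD

# First + Last Name matched:  10 + 15 = 25 > 19  -> duplicate
# First Name + Mobile only:   10 + 5  = 15 <= 19 -> not a duplicate
```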
I can think of the following approaches to the task (depending on the number of records, repetitiveness of the import, automation requirements etc.); they may all be good in some way. Would you care to elaborate on the current conditions?
Excel. You could filter out the data using Excel and then, once you've obtained a unique list, import it.
Plugin. On every creation of a new record, you'd check if it's to be regarded as duplicate-ish and cancel its creation (or mark it for removal).
Workflow. Run a process removing any instance calculated as a duplicate.
You also need to consider the implications of such elimination of data. There's a mathematical issue: suppose that the uniqueness radius (i.e. the threshold, in this 1D case) is 3. Consider the following set of numbers (it's listed twice, just in different orders):
1 3 5 7 -> 1 _ 5 _
3 1 5 7 -> _ 3 _ 7
Are you sure that's the intended result? Under some circumstances, you can even end up with result sets of different sizes (depending only on the order). I'm a bit curious about why and how this setup came up.
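To see the order-dependence concretely, here is a little sketch of the greedy elimination above:

```python
def greedy_dedupe(values, radius=3):
    """Keep a value only if it is at least `radius` away from every kept one."""
    kept = []
    for v in values:
        if all(abs(v - k) >= radius for k in kept):
            kept.append(v)
    return kept

print(greedy_dedupe([1, 3, 5, 7]))  # [1, 5]
print(greedy_dedupe([3, 1, 5, 7]))  # [3, 7] -- same set, different survivors
```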
Personally, I'd go with the plugin, if the above is OK by you. If you need to make sure that some of the unique-ish elements never get omitted, you'd probably be best off applying a test algorithm to a backup of the data. However, that may defeat its purpose.
In fact, it sounds so interesting that I might create the solution for you (just to show it can be done) and blog about it. What's the deadline?

Generic graphing and charting solutions

I'm looking for a generic charting solution, ideally not a hosted one, that provides the following features:
Charting a tuple of values where the values are:
1) A service identifier (e.g. CPU usage)
2) A client identifier within that service (e.g. server IP)
3) A value
4) A timestamp with millisecond/second resolution.
Optional:
I'd also like to extend the concept of a client identifier further. Taking the above example, I'd like to store statistics for each core separately, so another identifier would be Core 1/Core 2..
Now, to make sure I'm clearly stating my problem, I don't want a utility that collects these statistics. I'd like something that stores them, but, this is also not mandatory, I can always store them in MySQL, or such.
What I'm looking for is something that takes values such as these and charts them nicely, in a multitude of ways (timelines, motion, and the usual ones [pie, bar..]). Essentially, a nice visualization package that allows me to make use of all this data. I'd be collecting data from multiple services and multiple applications, and the datapoints will be of varying resolution. Some of the data will include multiple layers of nesting, some none. (For example, CPU would go down to Server IP, CPU#, whereas memory would only be Server IP but would include a different identifier, i.e. free/used/cached, as the "secondary" identifier. Something like average request latency might not have a secondary identifier at all, as in the case of ping.) What I'm trying to get across is that having multiple layers of identifiers would be great. To add one final example of where multiple identifiers would be great: adding an extra identifier on top of ip/cpu#, namely, process name. I think the advantages of that are obvious.
For some applications, we might collect data at a very narrow scope, focusing on every aspect, in other cases, it might be a more general statistic. When stuff goes wrong, both come in useful, the first to quickly say "something just went wrong", and the second to say "why?".
Further, it would be nice if the charting application threw out "bad" values; that is, if for some reason our monitoring program started reporting 300% CPU usage on a single core for 10 seconds, it'd be nice if the charts didn't reflect it in the long run. Some sort of smoothing, maybe? This could obviously be done at the data layer, though, so it's not a requirement at all.
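For what it's worth, a rolling median at the data layer is one cheap way to do that smoothing. A sketch (the window size is an arbitrary choice):

```python
from statistics import median

def median_filter(samples, window=5):
    """Replace each sample with the median of its neighbourhood, so a brief
    spike (e.g. a bogus 300% CPU reading) never reaches the chart."""
    half = window // 2
    return [median(samples[max(0, i - half):i + half + 1])
            for i in range(len(samples))]

readings = [40, 42, 41, 300, 43, 41, 40]    # one bogus spike
print(median_filter(readings))              # the 300 never appears
```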
Finally, comparing two points in time, or comparing two different client identifiers of the same service etc without too much effort would be great.
I'm not partial to any specific language, although I'd prefer something in one of the following: PHP, Python, C/C++, C#, as these are languages I'm familiar with. It doesn't have to be open source, and it doesn't have to be a library; I'm open to using whatever fits my purpose best.
More of a P.S than a requirement: I'd like to have pretty charts that are easy for non-technical people to understand, and act upon too (and like looking at!).
I'm open to clarifying, and, in advance, thanks for your time!
I am pretty sure that Protovis meets all your requirements, but it has a bit of a learning curve. You are meant to learn by example, and there are plenty of examples to work from. It makes some pretty nice graphs by default. Every property can be specified as a function, so you can do things like filter out your "bad" values.
