What is the best Learning Rate Scheduler for your model? - conv-neural-network

My purpose in this discussion is to find a good method for choosing the best Learning Rate Scheduler for a model.
I know there is No Free Lunch: continuously training and trying many schedulers until you find the one with the best performance is sometimes the only solution. But I hope you can share some of your experiences, so we can collect them and use them in the future.
First, I searched for kernels that introduce some Learning Rate Schedulers (LRS):
https://www.kaggle.com/code/tolgadincer/tf-keras-learning-rate-schedulers/
https://www.kaggle.com/code/snnclsr/learning-rate-schedulers/
https://www.kaggle.com/code/isbhargav/guide-to-pytorch-learning-rate-scheduling
Second, I have some questions to begin with:
What points do you consider when choosing an LRS?
If you have to choose between two LRSs, what criterion do you use to decide which one is better?
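As one hedged illustration of a possible selection criterion, here is a minimal sketch (PyTorch; the models and data loaders are hypothetical and assumed to exist) that compares two candidate schedulers by validation loss under an identical training budget:

```python
# A minimal sketch, assuming `model_a`, `model_b`, `train_loader`, and
# `val_loader` already exist. The selection "standard" here is simply
# held-out validation loss under the same training budget.
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR, StepLR

def train_with_scheduler(make_scheduler, model, train_loader, val_loader, epochs=20):
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    scheduler = make_scheduler(optimizer)
    loss_fn = torch.nn.CrossEntropyLoss()
    for _ in range(epochs):
        model.train()
        for x, y in train_loader:
            optimizer.zero_grad()
            loss_fn(model(x), y).backward()
            optimizer.step()
        scheduler.step()  # both candidates here are epoch-wise schedulers
    # score the run by mean validation loss
    model.eval()
    total, loss_sum = 0, 0.0
    with torch.no_grad():
        for x, y in val_loader:
            loss_sum += loss_fn(model(x), y).item() * len(y)
            total += len(y)
    return loss_sum / total

# Same budget, two candidates; the lower validation loss wins:
# val_cos  = train_with_scheduler(lambda o: CosineAnnealingLR(o, T_max=20), model_a, train_loader, val_loader)
# val_step = train_with_scheduler(lambda o: StepLR(o, step_size=7, gamma=0.1), model_b, train_loader, val_loader)
```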

Related

Is my Statistical Treatment of Data Correct?

I am aware that consulting a statistician is not free, and it is something I cannot afford, so I am trying my shot here. For the problem at hand, I've already finished data gathering for my research and am now calculating the results. However, I am stuck on what I should use for my statistical treatment of the data.
For background, I am using ISO 25010 to test my software's quality and user acceptance. The questionnaire consists of a number of questions for each cluster (functionality, reliability, usability, efficiency, maintainability, and portability). I've also used an agreement-type Likert scale. The hypothesis of my research says "There is no significant difference in the user acceptance results in terms of [clusters]". So far, I've used descriptive statistics to calculate the results: the mean for each question, the average mean (the average of the means within each cluster), and the mode.
I feel that the results I currently have might be lacking when the final defense comes. As far as I know, using a combination of statistical methods is okay and gives a stronger foundation for your results.
Based on the background of my research, what other statistical methods should I use?
I am thinking of the sample standard deviation, but I don't know if I should compute it by question or by cluster.
Sorry, statistics is not really my forte.
Thank you in advance for your answers.
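Not an authoritative answer, but here is a minimal sketch (pandas; the file name and question-to-cluster mapping are hypothetical) of how both options look in practice, computing the sample standard deviation per question and per cluster:

```python
# A minimal sketch with hypothetical column names. Assumes one row per
# respondent and Likert responses coded 1-5.
import pandas as pd

df = pd.read_csv("responses.csv")  # hypothetical file
clusters = {
    "functionality": ["q1", "q2", "q3"],  # hypothetical question-to-cluster mapping
    "usability": ["q4", "q5"],
}

# Per question: mean and sample standard deviation (pandas uses ddof=1 by default)
per_question = df[[q for qs in clusters.values() for q in qs]].agg(["mean", "std"]).T

# Per cluster: average each respondent's answers within a cluster first,
# then take the mean/std of those per-respondent cluster scores
cluster_scores = pd.DataFrame({c: df[qs].mean(axis=1) for c, qs in clusters.items()})
per_cluster = cluster_scores.agg(["mean", "std"]).T

print(per_question)
print(per_cluster)
```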

Finding probabilities of patterns in asset price movements based on multiple variables

I am seeking a method to allow me to analyse/search for patterns in asset price movements using 5 variables that move and change with price (from historical data).
I'd like to be able to assign a probability to a forecasted price move: when, for example, var1 and var2 do this and var3..5 do that, then price should do this with x amount of certainty.
Q1: Could someone point me in the right direction as to what framework / technique can help me achieve this?
Q2: Would this be a multivariate continuous random series analysis?
Q3: A Hidden Markov modelling?
Q4: Or perhaps is it a data-mining problem?
I'm looking for the what rather than the how.
One may opt to use Machine-Learning tools to build a learner that will either:
classify what kind the said "asset price movement" will be, and also serve statistical probability measures for such a Classifier prediction, or
regress the real target value to which the asset price will move, and also serve statistical probability measures for such a Regressor prediction.
A1: (While StackOverflow strongly discourages users from asking for opinions about a tool or a particular framework) not much damage or extra time would be spent if one researched the academic papers: there is quite a remarkable list of tools used repeatedly for ML in the context of academic R&D. For that reason, it would be no surprise to meet scikit-learn ML classes a lot, while some other papers work with R-based quantitative finance / statistical libraries. The tools, however, with all due respect, are not the core answer to all the doubts and initial confusion present in the mix of your questions. The subject confusion is.
A2: No, it would not. Well, unless you beat all the advanced quantitative research and happen to prove that the Market exhibits random behaviour (which it does not, and it would be a waste of time to re-cite the remarkable research published about why it is indeed not a random process).
A3: Do not try to jump on any wagon just because of its attractive tag or "contemporary popularity" in marketing-minded texts. With all due respect, understanding HMMs is beyond your current horizon, while you first need to understand what to look for.
A4: This is a nice proof of a missed target. This question shows, better than the others, how little of your own research effort went into covering the problem domain and acquiring at least some elementary knowledge before typing the last two questions.
StackOverflow encourages users to ask high-quality questions, so do not hesitate to re-edit your post and add some polish on this subject.
If in need of inspiration, try to review a nice and powerful approach to a fast Machine Learning process, where both Classification and Regression tasks also obtain probability estimates for each predicted target value (a sketch of this idea follows below).
For some idea of scale: highly performant ML predictors typically operate on much more than a set of 5 variables (called "features" in the ML domain). Think rather of some large hundreds to small thousands of features, typically heavily non-linear transformations of the original TimeSeries data.
There you go, if indeed willing to master ML for algorithmic trading.
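To make the classifier-with-probabilities idea concrete, here is a minimal sketch (scikit-learn, with synthetic stand-in data; the feature engineering from real TimeSeries data is assumed and not shown) where a classifier also serves per-class probability estimates via predict_proba():

```python
# A minimal sketch on synthetic data: a classifier whose predictions come
# with per-class probability estimates.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 50))    # stand-in for engineered features
y = rng.integers(0, 3, size=1000)  # stand-in for move classes: down/flat/up

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

pred = clf.predict(X_test)         # predicted class per sample
proba = clf.predict_proba(X_test)  # per-class probability estimates
print(pred[:5], proba[:5].round(2))
```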
You may like to read state-of-the-art research in this direction:
[1] Mondrian Forests: Efficient Online Random Forests, arXiv:1406.2673v2 [stat.ML], 16 Feb 2015
[2] Mondrian Forests for Large-Scale Regression when Uncertainty Matters, arXiv:1506.03805v4 [stat.ML], 27 May 2016
You may also enjoy other posts on the subject under the StackOverflow algorithmic-trading tag.

Scheduling Non-CPU Resources

I have to write an essay on scheduling non-CPU resources. I can't find any information on what exactly this entails; could anybody give me an idea as to what this is?
That's actually anything you can plan: for example, train scheduling at a station (movement between tracks, arrivals, departures, etc.). This particular example is actually the topic of this year's ROADEF competition; you can read about it here.
Transportation is a frequently solved real-life scheduling problem that a lot of companies face.
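As one hedged, concrete illustration of such a problem (hypothetical data, not the ROADEF formulation), here is a classic non-CPU scheduling question: how many station platforms are needed so that no two overlapping trains share a platform?

```python
# A minimal sketch: a greedy sweep over arrival/departure events yields the
# peak number of trains simultaneously in the station, i.e. the minimum
# number of platforms required.
def min_platforms(trains):
    """trains: list of (arrival, departure) times as comparable numbers."""
    events = sorted([(a, +1) for a, d in trains] + [(d, -1) for a, d in trains])
    current = peak = 0
    for _, delta in events:  # departures at the same instant free a platform first
        current += delta
        peak = max(peak, current)
    return peak

# Example: three trains, at most two in the station at once -> 2 platforms
print(min_platforms([(9.0, 9.5), (9.2, 10.0), (9.6, 11.0)]))  # -> 2
```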

How granular should tasks within a story be? [closed]

We've recently been implementing Scrum, and one of the things we often wonder about is the granularity of tasks within stories.
A few people inside our company state that ideally those tasks should be very finely grained; that is, every little part that contributes to delivering a story should be represented as a task. They argue that this enables tracking of how we are performing in the current sprint.
That leads to a high number of tasks detailing many technical aspects and small actions that need to be done, such as "create a DAO for component X to persist in the database".
I've also been reading Ken Schwaber and Mike Beedle's book, Agile Software Development with Scrum, and I've taken the understanding that tasks should really have this kind of granularity; in one of the chapters, they state that tasks should take between 4 to 16 hours to complete.
What I've noticed, though, is that with such small tasks we often tend to overspecify things, and when our solution differs from what we established in our planning meetings we need to create many new tasks or replace the old ones. Team members also balk at tracking each and every thing they are doing inside the sprint and at creating new tasks, since that means incrementing the total tasks in our burndown chart without necessarily adding a task that provides value.
So, ideally, how granular should tasks be inside each story?
Schwaber and Beedle say "roughly four to sixteen hours."
The upper bound is useful. It forces the team to plan, and helps provide daily visibility of progress.
The lower bound is a useful target for most tasks, to avoid the fragility and costs of overspecification. However, occasionally the team may find shorter tasks useful in planning, and is free to include those. There should be no mandated lower bound.
For example, one of our current stories includes a task to send something to another team -- a task that will take 0 hours, but one we want to remember to finish.
The number of tasks in your burndown chart is irrelevant. It's the remaining time that matters. The team should feel free to change the tasks during the sprint, as Schwaber and Beedle note.
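A tiny sketch (hypothetical numbers) of that point: splitting or adding tasks mid-sprint changes the task count, but the burndown only cares about the remaining hours:

```python
# A minimal sketch: the burndown tracks remaining hours, so the task count
# can rise while the chart still burns down honestly.
tasks_day_1 = {"build DAO": 8, "wire UI": 6}
tasks_day_2 = {"build DAO": 4, "wire UI part A": 2, "wire UI part B": 3}  # one task split in two

for day, tasks in enumerate([tasks_day_1, tasks_day_2], start=1):
    print(f"day {day}: {len(tasks)} tasks, {sum(tasks.values())}h remaining")
# day 1: 2 tasks, 14h remaining
# day 2: 3 tasks, 9h remaining  <- task count rose, remaining time still fell
```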
On my last assignment we had between 4 and 32 hours per task. We discovered that when we estimated tasks at more than ~32 hours, it was because we did not understand what to do or how to do the task at estimation time.
The effect was that the actual implementation time of those tasks varied much more than that of smaller tasks. We often also got "stuck" on those tasks, or picked the wrong path, or had misunderstood the requirements.
Later we learned that an estimate that long was a signal to try to break the task down further. If that was not possible, we rejected the task and sent it back for further investigation.
Edit
It also gives a nice feeling to complete tasks at least a couple of times a week.
It also gives rather fast feedback when something does not go as planned. If someone did not complete an 8-hour task in two days, we discussed whether the person was stuck on some part, whether somebody else had ideas on how to progress, or whether the estimate was simply wrong from the beginning.
Tasks should probably take one-half day to a day, maybe as much as two days sometimes.
Think about it this way: on a more macro level, short iterations promote agility by creating small amounts of value quickly and allowing plans to change as business needs change. On a more micro level, the same is true for tasks. Just like you don't want to spend 3 months on a single iteration, you don't want to spend a week on a single task.
Daily standup meetings can give you a clue that your task size is too big. If team members frequently answer "What did you do yesterday?" and "What will you do today?" with the same answer that they gave the day before, your tasks are probably not small enough.
For example, if a team member regularly answers "I worked on BigComplexFeatureObject today and will work on it tomorrow" for more than one day in a row, that's a clue that your tasks may be too big. Hopefully, on the majority of days a team member will report having completed one task and being about to start another.
Short tasks, 4-16 hours as others have said, also give the PO and team good feedback about project progress. And they prevent team members from going down "rabbit trails" and spending a lot of effort on work that might not be needed if business desires change.
A nice thing about having many smaller tasks is that it potentially gives the PO room to prioritize tasks better and optimize delivered value. You'd be surprised how many "important" parts of big tasks can be postponed or eliminated if they are their own small task.
Generally a good yardstick is that a task is something you do on a given day. This is ideal, which means it's rare. But it does fit nicely into that 4-16 hour estimate (some take half a day, some take two days, etc.) that you gave. Granted, I don't think I've ever spent an entire uninterrupted day on a single task. At the very least, you have to break for the scrum meeting. (At a previous job a day of coding was considered 6 hours to account for overhead.)
I can understand the temptation of management to want to plan every single granular detail. That way they can micro-manage every aspect of it. But in practice that just doesn't work. They may also think that they can then use the task descriptions to somehow generate detailed documentation about the software, essentially skipping that as an actual task itself. Again, doesn't work in reality.
Agile development does call for small work items, but taking it too far defeats the purpose entirely. It ends up becoming a problem of too much up-front planning and having to put in a ton of extra re-planning any time anything changes. At that point it's no longer agile, it's just a series of smaller waterfalls.
I don't think there is a universal answer to this question that fits every situation. I think you should try what your colleagues are proposing, and after the first sprint or two evaluate and see whether the process needs tweaking to accommodate everyone's needs and wishes.
That 4-hour figure sounds like a good minimum to me. I like to think in terms of visible results. Surely we don't have a task per line of code, per label on a screen, or per refactored utility method? But when we get to something that someone else can use, like a public class used by someone else, or a set of fields on a screen that allows some useful action, then that sounds like a trackable task to me.
For me the key question is "Do we know we've finished it?" With individual helper functions there's a pretty good chance of refactoring and change, but when I say to my colleague "Here, use this", it either works or it doesn't. The task's completeness can be evaluated.

With agile estimating, is it true some say to choose intervals like 1/2 to 1.5 days only? [closed]

With agile estimating, is it true some say to choose intervals like 1/2 to 1.5 days only?
It tends to be a good rule of thumb (agile or not) that your tasks should be broken down into at most 1-2 day increments.
The idea is that if you have larger chunks than that, then you haven't broken the task down enough, and you will more likely miss the estimate, and miss it by larger amounts of time than if you had broken it down. Often when you break a task down you discover your initial estimate was off, and since you have broken it down into more concrete tasks, your estimate is now more accurate, more trackable, and more meaningful.
For tasks that are coming up on your to-do list soon, you should pay attention to this; but for long-range planning, where you haven't necessarily thought the feature out in detail, I think larger estimates with tasks not yet broken out are OK.
Here's a link to Joel Spolsky talking about this. Take a look at item #5, about halfway down the page.
http://www.joelonsoftware.com/articles/fog0000000245.html
In my experience, any estimate that's longer than 2 days is probably hiding serious work that should be broken down further. Such estimates have a very high probability of going over. Try to break everything down into smaller chunks so that no individual chunk costs more than 1-2 days.
There are advantages to keeping the estimates short. It forces you to break up large tasks into small, discrete tasks that can be measured and discussed quickly, which helps promote the entire Agile development process.
That being said, I almost never keep a "rule" as a hard and fast rule with things like this. I'd say this is a good guideline, however.
My team consists of junior programmers (university students), and we've found that it's generally easier if we break all the large tasks down into a bunch of smaller ones. It involves more forward thinking, but in the end we are more productive and it's easier to evaluate our progress. It also brings a sense of achievement when you have something completed at the end of the day.
I would agree with that guideline. Anytime I have ever taken on a 5 day task, it has degenerated to a three week nightmare. Large estimates indicate you didn't learn enough about the problem up front to know what is involved, because if you had, you could have found ways to break it up better.
I don't agree. If a team's iterations are two weeks long, then of the 10 days, 1 day would be spent on iteration close (show & tell), iteration planning, and tasking or planning poker.
When playing planning poker, a team uses either geometric or Fibonacci progressions for estimates. For example, cards would contain values such as 1, 2, 4, 8, 16 or 1, 2, 3, 5, 8, 13. Each number reflects the number of days of development for a pair of programmers.
For each card, once discussion has occurred, each member simultaneously plays the card that reflects their estimate. If the majority of the team converges on the same estimate, the estimate is accepted. If there is much variation in the estimates, further discussion occurs (members explain the reason for their estimates) and another round of voting takes place. This occurs until consensus is reached.
If a number greater than 8 is picked, the card is deemed too big and is refactored into at least 2 smaller cards. The reason is that such a large estimate indicates the card is too big to be completed in a single iteration, and any estimate for it is very likely to be inaccurate.
Using such a method brings commitment from the team members to deliver all they have committed to, and even for a new team the estimates become accurate enough that carrying cards over soon becomes a low risk.
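As a hedged illustration of the voting rules described above (the consensus and split thresholds are simplified assumptions, not a standard implementation), one round might be resolved like this:

```python
# A minimal sketch of one planning poker round, under simplified rules:
# unanimous votes are accepted, cards estimated above 8 are split.
FIBONACCI_DECK = [1, 2, 3, 5, 8, 13]

def resolve_round(votes):
    """votes: card values played simultaneously by team members."""
    assert all(v in FIBONACCI_DECK for v in votes), "only deck cards may be played"
    if max(votes) > 8:
        return "split"    # card too big for one iteration; refactor into smaller cards
    if len(set(votes)) == 1:
        return votes[0]   # consensus reached: accept the estimate
    return "discuss"      # spread too wide: explain reasoning, vote again

print(resolve_round([3, 3, 3]))   # -> 3
print(resolve_round([2, 5, 3]))   # -> discuss
print(resolve_round([13, 8, 8]))  # -> split
```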
A very good post about agile estimation and planning can be found on the agile42 blog: Just enough, just in time.
A lot of good answers here, so I'll play devil's advocate and approach it from a different side.
There's a possible problem with breaking down things into very small estimates (# of hours) when doing things such as release planning. David Anderson discusses it in his (excellent) book Agile Management for Software Engineering.
Basically, the idea is that with a task that is very small, a developer will pad his estimate by a fair bit (say, turning a half hour into an hour, or doubling it) because of a certain amount of ego that would be bruised if the developer failed to complete such a small task in the estimated time. These local buffers add up quite a bit and result in a global buffer that's far bigger than it needs to be.
This is less of a problem if you stick with .5 days as a min - it's basically assumed that there's some buffer in there, so you won't need to pad it any more.
I feel there is a bit of mixing and overlapping of information in this thread... allow me to make my point :-)
1) The Fibonacci sequence, which is very much used through Mike Cohn's Planning Poker technique, is about estimating the "complexity" of User Stories. Stories are, as Cam said, normally written on cards, and entail more than one task, at least all of those needed to make a story shippable (Ken Schwaber, Alistair Cockburn, Mike Cohn...).
2) The tasks included to complete a story are normally estimated in ideal hours or Pomodori (Francesco Cirillo, "The Pomodoro Technique"). If you estimate in ideal hours, the rule of thumb is normally to keep tasks between half a day (3 ideal hours) and 2 days (12 ideal hours) of work. The reason is that the team then gets more qualitative status information, with a team member reporting a task as done at least every two days, which is much more "valuable" than a 60% done. If you use Pomodori, they are implicitly timeboxed to 25 minutes each.
The reason to keep tasks small comes basically from Empirical Process Control Theory: through transparency and regular inspection and adaptation, you can better check the progress of your work by quantifying it. The goal of having smaller tasks is to be able to clearly describe and envision in detail what will actually be done, without adding too much "guessing" due to the natural uncertainty of having to predict the future. Moreover, defining an outcome and a shorter timebox allows people to keep focus, with enough sense of urgency to make it a challenging and motivating experience.
I would also pick up the point about "motivation" and "ego" from Chris by adding that a good way to keep people committed and motivated is to define the expected outcome of a task, so as to be able to measure the results upon completion and celebrate the success. This idea is encapsulated in the Pomodoro Technique, but it can also be achieved using ideal hours for estimation. Another interesting part of the Pomodoro Technique is that breaks are considered first-class citizens and are planned regularly, which is very important, especially in creative and brain-intensive activities :-)
What do you think?
Best
ANdreaT
