Linux kernel "historical" git repository with full history


Many developers like to investigate source code with the help of git gui blame. As explained in the commit message for Linux-2.6.12-rc2 (also mirrored on GitHub), doing this across the kernel's full history requires a separate historical Linux repository:
Linux-2.6.12-rc2
Initial git repository build. I’m not bothering with the full history,
even though we have it. We can create a separate “historical” git
archive of that later if we want to, and in the meantime it’s about
3.2GB when imported into git — space that would just make the early
git days unnecessarily complicated, when we don’t have a lot of good
infrastructure for it.
Let it rip!
I have looked at a lot of the prepared historical repositories but I didn’t find one containing changes going back to version zero, so I gave up and am asking this question here.

Here is my setup.
I have a repository with a clone of the following remotes:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git
https://git.kernel.org/pub/scm/linux/kernel/git/davej/history.git
And the following grafts (info/grafts):
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 e7e173af42dbf37b1d946f9ee00219cb3b2bea6a
7a2deb32924142696b8174cdf9b38cd72a11fc96 379a6be1eedb84ae0d476afbc4b4070383681178
With these grafts, I have an unbroken view of the kernel history since 0.01. The first graft glues together the very first release in Linus' repository with the corresponding release of tglx/history.git. The second graft glues together tglx/history.git and davej/history.git.
There are a few older versions missing, and the older versions have release granularity instead of patch granularity, but this is the best setup I know of.
Edit: Dave Jones pointed me to http://www.archive.org/details/git-history-of-linux, which seems to be exactly what you want.

Here is a review of the options available in 2018, with a focus on tag availability and date correctness.
https://archive.org/download/git-history-of-linux/full-history-linux.git.tar
Developed by Dave Jones, and made available on archive.org.
Covers early versions to 2010.
244,464 commits
Just 184 tags, covering versions in 2.6. The tags that should have been created for all versions seem to be missing.
Early commits have realistic dates, but incorrect times (11:00:00 199X -0600).
Some dates seem to be incorrect. For example, both 2.1.110 and 2.1.111 are dated Wed May 20 11:00:00 1998 -0600, although the latest file in the 2.1.111 snapshot is dated 1998-07-25 09:17.
The creation process is documented on GitHub and seems very thorough.
https://git.kernel.org/pub/scm/linux/kernel/git/tglx/history.git/
Created by Thomas Gleixner.
Covers 2.4.0 to 2.6.12-rc2.
Contains 170 tags covering 2.5.X and 2.6.X.
63,428 commits
Dates are correct.
Contains patches converted into commits.
https://github.com/mpe/linux-fullhistory
Created by Michael Ellerman, derived from work by Yoann Padioleau, based on historical trees reconstructed by Dave Jones and Thomas Gleixner, and Linus' mainline tree.
Covers full history
Provides only 558 tags, mostly starting at 2.0.0.
790,471 commits
Same issues with dates as in Dave Jones's repo.
Uses replace objects instead of grafts.
https://git.kernel.org/pub/scm/linux/kernel/git/history/history.git/
Owned by the Linux history team.
Covers early versions to 2.6.33-rc5.
1710 tags, starting with 0.10, covering most early versions.
244,774 commits
Most historic versions are incorrectly dated Fri Nov 23 15:09:04 2007 -0500.
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/
Modern Linux development.
Covers 2.6.12-rc2 (2005) until today
569 tags
777,419 commits (August 2018)
Proper commits

The referenced repos no longer exist. The new one is here:
https://git.kernel.org/cgit/linux/kernel/git/history/history.git/
If, like me, you want to keep some repos separate, you can combine alternates with the grafts to do so:
# Same dir as main linux
$ git clone --bare git://git.kernel.org/pub/scm/linux/kernel/git/history/history.git
$ cd linux/.git/
$ echo ../../../history.git/objects >> objects/info/alternates
$ echo 1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 e7e173af42dbf37b1d946f9ee00219cb3b2bea6a >> info/grafts
$ echo 7a2deb32924142696b8174cdf9b38cd72a11fc96 379a6be1eedb84ae0d476afbc4b4070383681178 >> info/grafts
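
A note for newer Git versions: support for info/grafts is deprecated (Git 2.18 and later print a hint about it), and replace objects are the suggested substitute. Git can convert an existing file itself with git replace --convert-graft-file. If you would rather see what is going to happen first, here is a minimal sketch that turns graft lines into the equivalent git replace --graft commands. The function name grafts_to_replace is made up for this example, and it only prints the commands; nothing is executed.

```shell
# Convert "commit parent..." graft lines (read from stdin) into the
# equivalent `git replace --graft` commands, printed for review only.
grafts_to_replace() {
    while read -r commit parents; do
        # Skip blank lines and lines starting with '#'.
        case "$commit" in
            ''|'#'*) continue ;;
        esac
        printf 'git replace --graft %s %s\n' "$commit" "$parents"
    done
}

# The two grafts from this thread.
grafts_to_replace <<'EOF'
1da177e4c3f41524e886b7f1b8a0c1fc7321cac2 e7e173af42dbf37b1d946f9ee00219cb3b2bea6a
7a2deb32924142696b8174cdf9b38cd72a11fc96 379a6be1eedb84ae0d476afbc4b4070383681178
EOF
```

Run the printed commands inside the linux clone once the history objects are reachable (for example through the alternates entry above).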

The best I've found is git://git.kernel.org/pub/scm/linux/kernel/git/davej/history.git. Its history starts from Linux-0.01, but many commit messages are poor, something like "Import 2.1.38pre1".
Anyway, there is a lot of knowledge in there.
Thanks for help!

Related

Trouble organizing perforce streams to accommodate multiple main branches with streams

So we have a project where there are multiple "main" branches being worked on at once. So there is 1.0.0, 2.0.0, and 3.0.0. Things that go into 2.0.0 cannot go into 1.0.0, etc. Each branch gets merged forward, 1.0.0 > 2.0.0 > 3.0.0.
I don't think we can use the normal stream flow because if we set up release branches, you can't get feature branches off of them, plus these aren't "releases" just yet; they are still in active development. If we go below, then everything has to go through one main branch to get to releases and there's no way to segregate files.
So I guess my question is, is there a proper way to set up streams for something like this?
thanks

A lot of the assumptions around use of the mainline model come from an environment where releases are being cut relatively infrequently (twice a year), and only patched for critical bugs -- so changes that need to go from one release to another tend to be the exception rather than the rule. In this model, the vast majority of merges are simply from the newest release (e.g. while the release is being stabilized, which comes at a point in the cycle where there's very little activity in the release prior to that one) or from a dev branch back to the mainline, and from the mainline back to dev branches (since dev branches are mostly working on new features that are destined for the mainline but not any release branches). Changes only go from the mainline to a release branch if they're being manually cherry-picked to address a critical bug (which is rare), and never straight from a dev branch to a release branch. It's slightly awkward to cherry-pick a fix from an old release branch up to a bunch of later release branches, but it's very infrequent in this model so the awkwardness doesn't matter that much.
If you're very actively working on multiple releases simultaneously, the mainline model has less value since you need to either:
merge between sibling/cousin branches (which mostly works okay but can get awkward if there's been a lot of refactoring)
carefully cherry-pick changes to and from the mainline to enforce correct flow of change (which makes for cleaner merges but a bit more manual tracking)
The orthodox recommendation here would likely be to rethink your release methodology/policy to not require so much "waterfalling", but I'm assuming you have business reasons for requiring it to work this way. Given that constraint, I think you probably don't want to use the concept of different stream "types" and "flow" at all since those have assumptions about the mainline model built in, and what you're doing is fundamentally not a mainline development model.
To implement a non-mainline model in streams (which does still have some value without the flow-management guides since it'll help you manage your client views and whatnot) you'll probably want to use some combination of:
make every stream after the initial mainline a development stream (that's the most permissive I think)
set the mergeany option in every stream (that allows merging in all directions rather than trying to enforce "firmness" which is a mainline model concept)
use the -F option when merging to ignore flow direction (I think mergeany makes this unnecessary if you use it consistently)

npm how to commit package-lock.json if I did not use GitHub

I downloaded NodeJS and installed it on windows 10.
I updated npm using the command npm install npm@latest -g.
I used neither GitHub nor anything else.
I get the message "created a lockfile as package-lock.json. You should commit this file".
What should I do to commit the file?
What happens when I commit it?
What happens if I don't commit it?
Please do not quote the npm documentation, as I read it several times and did not understand it.
Thanks

Thank you – jdubjdub & – SilverWolf - Reinstate Monica for your cogent remarks. Yours were the comments that led to my eventual comprehension; years after the fact for you, hours after the fact for me.
I know this is an old post and the OP and OC's are likely to never see this answer; I am dealing with this problem for the first time and have typed into the browser a reformulation of this very question many times today. I was near to trying the patience of some of the genuinely gifted people on Stack Overflow, attempting clarification for my own elementary version of the question.
I have finally come to recognise that the term "commit" is not as alien as we thought. It is what we thought it might have been: the same commit we make on GitHub when we have changed something and want other contributors (or authors) to see it.
I have a github account and git installed on my computer. After about an hour of docs and man pages I did not put all the dots together until I read the comments on this page and then it all finally just became so perfectly clear. Commit is the same concept on github as is being referenced in my terminal: npm notice created a lockfile as package-lock.json. You should commit this file. And since I am not actively working on a project I can disregard the warning. Also, some argue the merit of ignoring it even if I were on production. But that's beyond the scope of this post.
Those who only rarely use git or GitHub have only the vaguest grasp of how one performs version control. The term "commit", even when mentioned frequently in an answer, does not necessarily rule out every meaning or action but one.
Our friends may not fully appreciate our incomprehension of the term "commit" in this context. We were thinking (in my naivete, at least I was) that there may be more than the one meaning to "commit" we are familiar with i.e. from github.
I guess these are just such a common abstract for anyone doing version control they assume since we've gotten this far we must grasp the nomenclature and are really puzzled by the deeper fundamental aspects of the underlying processes and procedures.
If you would like a more robust definition of exactly how (and why) to commit, check this out: https://help.github.com/en/desktop/contributing-to-projects/committing-and-reviewing-changes-to-your-project.
I should note there are many free and paid version control tools available. Here is a non-exhaustive list of 15: https://www.softwaretestinghelp.com/version-control-software/.
And here the github desktop app:
https://desktop.github.com/.
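
For anyone who wants to see what "commit this file" looks like without GitHub at all: git works entirely on your own machine. The sketch below is only an illustration under a few assumptions. It creates a throwaway directory with stand-in package.json and package-lock.json files (the "demo" project and the "Example" identity are made up), turns it into a local repository, and commits both files.

```shell
# Throwaway project directory, so nothing real is touched.
project=$(mktemp -d)
cd "$project" || exit 1

# Stand-ins for the files npm would have generated.
printf '{ "name": "demo", "version": "1.0.0" }\n' > package.json
printf '{ "name": "demo", "lockfileVersion": 1 }\n' > package-lock.json

# Start tracking this folder with git (local only, no GitHub involved).
git init -q

# Stage both files, then record them as one commit.
git add package.json package-lock.json
git -c user.name="Example" -c user.email="example@example.com" \
    commit -q -m "Add package.json and package-lock.json"

# Show the commit that was just made.
git --no-pager log --oneline
```

In a real project you would run only the git init, git add and git commit steps in the folder where npm created the lockfile. If the folder is not under version control and you do not plan to put it there, the notice can simply be ignored.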

How to use gitversion on older codebase

I tried running GitVersion on an extremely old and complicated codebase that was migrated with git-tfs. I do get results in most branches but they are incorrect.
How do I get started with existing repositories?
How do I debug the results so I know what contributed to the final output?
Can some branches be ignored because they are incorrectly contributing a larger number?

Sorry to hear that you are having issues using GitVersion.
Not sure what you mean here, can you elaborate?
The GitVersion log output is quite verbose, as you can see here as an example. It walks you through what base versions it can locate, and from there, how it asserted the final version number.
You might want to take a look at the commits-before configuration option, which might help with an older code base that wasn't sticking to conventions: it lets you ignore the prior commit history and start versioning afresh.
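
As a rough illustration of commits-before, the snippet below writes a minimal GitVersion.yml into a scratch directory. The cut-off date is only a placeholder, and it is worth checking the exact key layout against the configuration docs for the GitVersion version you run.

```shell
# Write a minimal GitVersion.yml into a scratch directory.
# The date is a placeholder: pick the point where your history became reliable.
workdir=$(mktemp -d)
cat > "$workdir/GitVersion.yml" <<'EOF'
ignore:
  commits-before: 2015-10-23T12:23:15
EOF

# Print the file so it can be reviewed before copying it to the repository root.
cat "$workdir/GitVersion.yml"
```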

SVN for ClearCase user. How to use?

We used to work with IBM Rational ClearCase.
Now we have started a new project on Linux (Ubuntu), and for that we use SVN (Tortoise).
I would like some help understanding it. For example: what is "making a baseline" in SVN? I don't understand all these version numbers. We have a MAIN trunk/branch (which should be equivalent to a stream with a view on it in ClearCase), and under it everyone has their own branches (their own stream with their own views on it). If I open the SVN version tree I see a lot of numbers and can't work out where my branch came from.
thanks in advance :-)

As explained in What are the basic ClearCase concepts every developer should know?, the main difference you will find between ClearCase and most of the more recent VCS is:
ClearCase reasons file-by-file, and not on the repository level.
So when ClearCase makes a baseline, it actually takes all the latest versions of the files of a given component and applies a label to each file.
SVN will simply make an atomic operation, making a new revision of the repository with a new tag (which is actually a cheap copy in a tag "directory", like SVN branches: see "What do you use the svn tags directory for anyways?").
Note also that "baseline" in ClearCase refers to the UCM methodology, which is a complement to ClearCase, and which has no equivalent in SVN.
A baseline in ClearCase is for a "component", i.e. a specific subset of all the files of a VOB.
An SVN repo is just a massive centralized place where you can version any number of files you want. You can consider a specific directory of that SVN repo as a component (and "tag" just that), but that is entirely at your discretion: you won't "declare" a component in SVN first, before "baselining" it.

A good resource for an overview is the SVN book:
http://svnbook.red-bean.com/
Note: SVN does not use the term baseline. It uses tags instead, which take a somewhat different approach to the baseline concept in ClearCase.
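
To make the tag idea concrete, here is a small sketch, assuming the conventional trunk/tags layout and a made-up tag name baseline-1.0. By default it only prints the svn commands (a dry run); set SVN=svn to really run them inside a working copy. The second function also speaks to the "where did my branch come from" question: svn log --stop-on-copy stops at the copy that created the branch, and with --verbose the last entry shows the path and revision it was copied from.

```shell
# SVN defaults to a dry run that only prints each command; set SVN=svn to execute.
SVN=${SVN:-echo svn}

# "Baseline" the trunk: a tag is a cheap server-side copy made in one revision.
# The trunk/tags layout and the tag name are assumptions.
make_tag() {
    $SVN copy "^/trunk" "^/tags/$1" -m "Tag $1"
}

# Show a branch's own history, stopping at the copy that created it.
# Run it inside a working copy of the branch.
branch_origin() {
    $SVN log --verbose --stop-on-copy
}

make_tag baseline-1.0
branch_origin
```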

Perforce Howto? Syncing/Merging files between branches

(A) ------- (B) ----------- (C)
 |           |               |
Trunk   ReleaseBranch   DeveloperBranch
Developers work in the C branch and check-in all the files. The modified files are then labeled in the C branch. The binaries that get deployed are built from B branch
and labeled. Currently all this is manual.
In Perforce, is there a simple way to accomplish this like merging Branches based on labels etc?

It's not immediately clear how much automation you already have, or how much automation you seek. Perforce itself provides the tools to keep track of integration and branching, but if you want to do things like automated builds and labeling you'll need to look outside the source code control world into the release management/automation world.
I'm going to assume you have two branches:
(B) //depot/yourcode/rel/...
(C) //depot/yourcode/dev/...
Inside these branches the code layout is roughly similar, though dev will be newer (and possibly buggier) than rel. (Your text doesn't explain what you're doing with trunk, so I'm ignoring it.)
Let's say you're developing in dev and you want to release code. You create a label (let's call it MYCODE_DEV.1.0) with the files you want to release. You can integrate it into rel with:
p4 integrate //depot/yourcode/dev/...@MYCODE_DEV.1.0 //depot/yourcode/rel/...
That integrates from the MYCODE_DEV.1.0 label to the release branch. Perforce keeps track of which file revisions you've merged and which file revisions you haven't merged, so it'll only merge new code. If you've made changes to rel that weren't in dev, you'll need to resolve the changes (either automatically, or by hand). You can then check the changes into rel, create a new label, and release from there.
(Since Perforce keeps track of what you've merged, if you try to integrate the same label again, Perforce will politely decline to do anything, though you can override it if you think you know better.)
(If you read the Perforce documentation, you'll find references to "branch specs", which let you declare a named branch as a shorthand for specifying both the source and destination branches in your integration command. Branch specs are especially useful for maintaining complicated branches with source files scattered across multiple directories, but don't really add value to the simple example here.)
Perforce gives you the tools you need to set up your branches and releases to meet your goals, which can be easily scripted, but doesn't directly do automated releases.
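
As a sketch of how that scripting might look, the function below strings together the steps described above: integrate from a label, auto-resolve, submit, then label the release branch. The function name promote_label and the label MYCODE_REL.1.0 are made up for the example, and the depot paths are the assumed ones from earlier. By default it only prints the p4 commands (a dry run); set P4=p4 to run them for real.

```shell
# P4 defaults to a dry run that only prints each command; set P4=p4 to execute.
P4=${P4:-echo p4}

# promote_label LABEL SRC_PATH DST_PATH NEW_LABEL
promote_label() {
    label=$1 src=$2 dst=$3 new_label=$4
    # Open for integration everything in the label that has not been merged yet.
    $P4 integrate "$src@$label" "$dst"
    # Accept automatic merges; files with conflicts stay unresolved for manual work.
    $P4 resolve -am
    # Submit the integrated files to the release branch.
    $P4 submit -d "Integrate $label into release"
    # Label what is now in the release branch.
    $P4 tag -l "$new_label" "$dst"
}

promote_label MYCODE_DEV.1.0 //depot/yourcode/dev/... //depot/yourcode/rel/... MYCODE_REL.1.0
```

If the auto-resolve leaves conflicts behind, the submit will refuse to go through, which is the point where someone has to resolve by hand.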
