Connect git repository with unrelated repository
I am building a Shopify theme on top of an existing Shopify "base" theme, called Dawn. Here is the git repository for it: Dawn. The thing is that I downloaded this theme with the Shopify CLI (init command), which does not create a local repository but just copies the Dawn code: Cli. I have since made multiple code modifications, and only now realized that I have no connection to the original Dawn repository, which would allow me to pull official updates of the original theme. What I should have done in the first place, I think, is fork the original repository. What are my options for integrating my modified repository into the original and using it in the future? Oh, by the way, I have my own repository for tracking my own changes.
Solution 1:[1]
You can clone the original repository, then copy your existing commits from your existing, unrelated repository to new-and-improved commits that add on to the clone. You can't quite use your existing commits as-is because they're not related to the commits in the clone.
The way to grok this is to begin with the understanding that a Git repository is merely a collection of commits (plus some other stuff that helps you work with those commits, but that's of no importance at all unless you plan to get some work done). It's the commit that is the be-all and end-all of Git. A commit, besides being uniquely numbered with a hash ID or object ID, contains two parts: data (a source snapshot) and metadata. The metadata for each commit includes a list of previous commit hash IDs, so the commits are deeply intertwingled (though not as deeply as we might sometimes like, as the links go only one way: backwards through history). In this way the commits are the history.
As such, by creating your own initially-empty repository and adding commits, you've created history that has nothing at all to do with the history in the original repository. To fix this, you must splice the two histories.
There is more than one way to do that, and the "easy" way—with easy in quotes because it may well be quite hard—is to use git merge --allow-unrelated-histories:
git clone <repo-1-url>
cd <repo>
git remote add repo2 <repo-2-url>
git fetch repo2
git switch <appropriate-branch-name>
git merge --allow-unrelated-histories repo2/<appropriate-name>
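If you'd like to try this on throwaway repositories before touching real ones, here is a minimal sketch. Everything in it is made up for illustration: the directory names repo1, repo2 and merged, the files theme.txt and custom.txt, and the demo identity. It first checks that the two histories share no commit (git merge-base exits non-zero in that case), then does the merge and counts the parents of the resulting commit. The demo is arranged so the merge is conflict-free; a real merge of two unrelated histories may well produce conflicts.

```shell
set -e
# Throwaway identity so the demo does not depend on your Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"

# repo1: stand-in for the original repository.
git init -q repo1
git -C repo1 symbolic-ref HEAD refs/heads/main
echo "base theme" > repo1/theme.txt
git -C repo1 add theme.txt
git -C repo1 commit -q -m "A"

# repo2: stand-in for your repository, started from copied files
# (same theme.txt, plus one file of your own), with no shared history.
git init -q repo2
git -C repo2 symbolic-ref HEAD refs/heads/main
cp repo1/theme.txt repo2/theme.txt
echo "my customisation" > repo2/custom.txt
git -C repo2 add theme.txt custom.txt
git -C repo2 commit -q -m "F"

# The steps from the answer, applied to the stand-ins.
git clone -q repo1 merged
cd merged
git remote add repo2 ../repo2
git fetch -q repo2

# No common ancestor yet: merge-base prints nothing and fails.
if git merge-base main repo2/main >/dev/null; then
  related_before=yes
else
  related_before=no
fi

git merge -q --allow-unrelated-histories -m "M" repo2/main

# The merge commit M ties both histories together, so it has two parents.
parents=$(git log -1 --format=%P | wc -w | tr -d ' ')
echo "related before merge: $related_before, parents of M: $parents"
```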
The problem with this approach is that you end up with a commit graph that looks like this (simplified):
A--B--C--D--E   <-- origin/main   # from repo1
             \
              M   <-- new-branch   # in merged repo
             /
F--G--H--I--J   <-- repo2/main
This is a true and accurate accounting of how you got to where you were when you discovered that you were in a mess, but you might prefer to have a fictitious accounting that looks like you never made any mistake in the first place:
A--B--C--D--E   <-- origin/main   # from repo1
       \
        G'-H'-I'-J'   <-- main   # we started from C
where your commit F more or less matches their commit C—either exactly, or so close that it's closer than any other commit in their repository, and you don't mind the fiction that pretends you started with their commit C rather than your commit F.
Your new-and-improved commit G' is an exact copy of your original commit G except that its parent is C, not F. Your new-and-improved H' is a duplicate of your H except that its parent is G'. This repeats for all your commits: they're all exactly as you made them, except for their metadata parent linkage, which provides the new fictional history in which you never made commit F, having cleverly started from their commit C instead.
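When F matches one of their commits exactly, one way to find that commit is to compare tree hashes, since two commits with identical snapshots have the same tree. The following is only a sketch on a throwaway repository: the branch names theirs and mine, the file theme.txt, and the demo identity are all invented for illustration, and it handles only the exact-match case. If nothing matches exactly, you would have to pick the closest commit by hand (for instance by diffing).

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/theirs

# Their history so far: A--B--C
for m in A B C; do
  echo "$m" > theme.txt
  git add theme.txt
  git commit -q -m "$m"
done

# Your unrelated history: root commit F holds a copy of C's files.
# (checkout --orphan keeps the index, so F gets C's snapshot.)
git checkout -q --orphan mine
git commit -q -m "F"

# Meanwhile their history moves on: A--B--C--D--E
git checkout -q theirs
for m in D E; do
  echo "$m" > theme.txt
  git commit -q -a -m "$m"
done

# Root commit of your history, and the tree (snapshot) it records.
F=$(git rev-list --max-parents=0 mine)
want=$(git rev-parse "$F^{tree}")

# First commit on their branch whose tree is identical to F's tree.
match=$(git log --format='%H %T' theirs |
  awk -v t="$want" '$2 == t { print $1; exit }')

matched_subject=$(git log -1 --format=%s "$match")
echo "F matches their commit: $matched_subject"
```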
To get this fictional history, we again clone and fetch commits so that we start with:
A--B--C--D--E <-- origin/main # from repo1
F--G--H--I--J <-- repo2/main
in a new repository. Then we use git replace --graft to make a new-and-improved replacement for G, which we'll call G" even though it's our first replacement. Git supports these grafts and other replacements on a single-repository-at-a-time basis (replacements are not normally copied during cloning) and most Git commands follow the replacing (unless you run git --no-replace-objects command), so now that we have:
A--B--C--D--E   <-- origin/main   # from repo1
       \
        G"   <-- refs/replace/<hash-of-G>
        :
     F--G--H--I--J   <-- repo2/main
in this repository, git log repo2/main shows commit J, then I, then H, then G" (replacing G), then C, then B, and so on.
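The graft itself is one command of the form git replace --graft <commit> <new-parent>. Here is a small sketch on a throwaway repository with a shortened history (A--B--C for them, F--G for you). The branch names theirs and mine, the file theme.txt, and the demo identity are assumptions made for the demo only. It compares what git log reports with the replacement in effect and with replacements turned off.

```shell
set -e
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q .
git symbolic-ref HEAD refs/heads/theirs

# Their history: A--B--C
for m in A B C; do
  echo "$m" > theme.txt
  git add theme.txt
  git commit -q -m "$m"
done
C=$(git rev-parse HEAD)

# Your unrelated history: F--G, where F holds a copy of C's files.
git checkout -q --orphan mine
git commit -q -m "F"
echo "my change" >> theme.txt
git commit -q -a -m "G"
G=$(git rev-parse HEAD)

# Create a replacement for G that is identical except that its parent is C.
git replace --graft "$G" "$C"

# With replacements honored (the default), history runs through C.
with_graft=$(echo $(git log --format=%s mine))
# With replacements disabled, the real history (through F) shows again.
without_graft=$(echo $(git --no-replace-objects log --format=%s mine))

echo "with graft:    $with_graft"
echo "without graft: $without_graft"
```

The replacement lives under refs/replace/ in this one repository, so nothing has been rewritten yet.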
Now all we have to do is "cement the graft in place", using git filter-branch or the newfangled (and easier to use) git filter-repo. These commands follow replacements (unless run with --no-replace-objects anyway, and we won't do that) while re-copying all the commits. Using git filter-repo with its various constraints, which I have never actually done, or git filter-branch --tag-filter cat -- --all, and some small bits of follow-up work, we'll eventually get:
A--B--C--D--E   <-- origin/main   # from repo1
       \
        G'-H'-I'-J'   <-- main
and things will look the way we intended. We discard the original commits—this "filtered" repository is a new repository and the original repository and its clones should all be archived and put away and never used again, ideally—and we use this as the new place from which to clone, and then go on our merry way with history looking like we knew what we were doing when we started.
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | torek |
