Adding a folder from one repo to another

I have been experimenting with creating a _subtree_.

RepoFolderSrc RepoFolderTarget RepoAnother
        | | |
      develop sfcc------------------sfcc18
        | |
        | |
        |-someFolder1 |
        |-someFolder2----------featureBranch

I want to copy _someFolder1_ and _someFolder2_ from _RepoFolderSrc_ to _RepoFolderTarget_. I want to retain all the history of both folders when they are copied over to the _featureBranch_.

A few of the solutions I have seen require deleting the _origin_ from _RepoFolderSrc_. I should not make any changes to _RepoFolderSrc_, and I should not create any additional branches or commits there.

Is there a way to copy the folders over by working only in _RepoFolderTarget_? I do have push and pull permissions for all repositories.


If the solution described in this blog post is one of the ones you’ve seen and disliked because it appeared to change the source repository, then I’m afraid it is the canonical way to do what you’re describing. I also think you’re misunderstanding it when you say that the solution requires “deleting the origin” or “create additional branch or commits”. Let me see if I can rephrase the solution more clearly.

The goal here is to copy from a repo named source to a repo named destination only the contents of folder foo, including all history that touches folder foo. In order to do that, because Git doesn’t track folders per se but tracks commits, we’ll have to filter the commit history of the source repo to only contain the commits that touch folder foo. The way we’ll do that is via the git filter-branch command using the --subdirectory-filter filter. First, we’ll need to create a local copy of the source repo separate from any local copy you already have. You can do this by doing the following:

mkdir backup
cd backup
git clone https://github.com/user/source.git
cd source

Now, we are in the directory backup/source which is a completely new local copy of the source repo separate from any local copy you already have. Additionally, we want to protect https://github.com/user/source from being accidentally changed by our process here. We do that by deleting the origin remote, but only in this new local copy.

git remote rm origin

Next, we want to filter the commit history of the source repo to only contain the commits that touch folder foo. We do that by issuing the filter-branch command:

git filter-branch --subdirectory-filter foo -- --all

But, as you’ll notice in the documentation for the git filter-branch --subdirectory-filter command, it states:

The result will contain that directory (and only that) as its project root.

This means that everything that was in foo is now in the root of the repo. So, in order to put it all back in folder foo, we have to add one commit to this local-only temporary working repository that we’re going to throw away soon:

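mkdir foo
# Move everything except .git and foo itself; a plain "mv * foo" also tries to move foo into itself and skips dotfiles
find . -mindepth 1 -maxdepth 1 ! -name .git ! -name foo -exec mv {} foo/ \;
git add -A
git commit -m "Move files back into foo"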

Now we have a local repository that contains only the files and history that we want. We’re halfway there! 🎉 And remember, this is just a temporary place to store some work. It is completely separated from your normal local source repository and from https://github.com/user/source, so any changes we make here are not permanent. It’s like making a temporary backup of a file so that you can revert back to the original if you screw something up.

Now, in order to merge this new history that we’ve created into the destination repository, let’s assume that the destination repository is already cloned locally in projects/destination from https://github.com/user/destination. So we need to:

cd ../../projects/destination

in order to move from backup/source to projects/destination. Then we issue the following commands:

git remote add modified-source ../../backup/source
git pull modified-source master --allow-unrelated-histories

This pulls all of the commit history of our temporary source repo into the master branch of our local copy of the destination repo. Then all that is needed is to clean things up:

git remote rm modified-source

This removes the link between our local destination repo and the temporary source repo. And when you’re sure everything worked correctly, you can delete the modified local source repo whenever you feel like.
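
For anyone who wants to try the whole sequence safely first, here is a minimal sketch that runs it end to end against two throwaway repositories in a temporary directory. The file names, commit messages, and identity variables are made up for the demo. The `--no-rebase` and `--no-edit` options are additions so that the pull runs unattended on recent Git versions, and `FILTER_BRANCH_SQUELCH_WARNING` only silences the deprecation notice that newer versions of filter-branch print.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so the demo commits work without any git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
mkdir -p "$work/projects" "$work/backup"

# Stand-in for the real source repo: two folders, three commits.
git init -q "$work/source"
cd "$work/source"
git symbolic-ref HEAD refs/heads/master
mkdir foo other
echo one > foo/a.txt
echo one > other/b.txt
git add .
git commit -q -m "add foo and other"
echo two >> foo/a.txt
git commit -q -am "change foo"
echo two >> other/b.txt
git commit -q -am "change other"

# Stand-in for the destination repo.
git init -q "$work/projects/destination"
cd "$work/projects/destination"
git symbolic-ref HEAD refs/heads/master
echo readme > README
git add .
git commit -q -m "initial destination commit"

# The steps from this post: fresh clone, drop origin, filter, move back under foo.
cd "$work/backup"
git clone -q "$work/source" source
cd source
git remote rm origin
git filter-branch --subdirectory-filter foo -- --all
mkdir foo
find . -mindepth 1 -maxdepth 1 ! -name .git ! -name foo -exec mv {} foo/ \;
git add -A
git commit -q -m "Move files back into foo"

# Merge the filtered history into the destination, then clean up the remote.
cd ../../projects/destination
git remote add modified-source ../../backup/source
git pull -q --no-rebase --no-edit --allow-unrelated-histories modified-source master
git remote rm modified-source

git log --oneline
```

The stand-in source repo under `$work/source` is never written to, which is the point of working in the separate clone.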

I hope that helps explain things a bit better! Let me know if you have any questions.


very detailed and targeted answer, this is really useful


Really good answer. From my understanding, this will copy the most recent commit and the history for the specific subfolder. Is there a way to keep the source remote, and thus the ability to sync only that folder?

Thank you

You can’t keep the source remote and sync only that folder, no. But, you could remove the source folder from the original repository and add the new repository back to the original one as a submodule. I believe this would provide essentially the functionality you’re looking for.
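
The submodule idea could look roughly like this. It is only a sketch against throwaway repositories: the `foo-repo` path stands in for wherever the extracted repository ends up being published, the identity variables are made up, and `protocol.file.allow=always` is there only because the demo uses a local path as the submodule URL, which recent Git versions refuse by default.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so the demo commits work without any git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Hypothetical extracted repository that now owns the folder's contents.
git init -q "$work/foo-repo"
cd "$work/foo-repo"
echo one > a.txt
git add .
git commit -q -m "foo contents"

# Hypothetical original repository that still carries its own copy of foo.
git init -q "$work/source"
cd "$work/source"
mkdir foo
echo one > foo/a.txt
echo readme > README
git add .
git commit -q -m "initial"

# Remove the in-tree folder, then add the new repository back at the same path.
git rm -r -q foo
git commit -q -m "Remove foo in favour of a submodule"
git -c protocol.file.allow=always submodule add "$work/foo-repo" foo
git commit -q -m "Add foo as a submodule"

git submodule status
```

Note that, unlike the copy procedure, this does change the original repository, so it only fits if that is acceptable.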

I hope that helps!


Hello @lee-dohm ,

If I want to wrap your answer in a batch file, is there any way to revert the old repo to its original state before running this batch file?

I’m not sure I understand the question @agwatic. The whole point of my answer is for it to not change the original repo in any way, so I’m not sure what there would be to revert?


Hi @lee-dohm 

git filter-branch --subdirectory-filter foo -- --all

We had a similar issue, and used your solution but ended up losing part of history.

The folder we wanted to move to another repo had itself been moved from a different directory a year ago. In the current repository we can see the whole history, but when I use filter-branch to extract the folder, the history from before the directory change is gone.

Do you have any ideas how we can keep whole history for this folder?

Thanks.


Unfortunately, I do not have any ideas @utkucan. I assume that the history before the directory change was carried out is gone because you’re using the --subdirectory-filter option and specifying the current directory. This means that the old directory is specifically filtered out. From the documentation it doesn’t appear that specifying multiple directories is an option.

You may want to experiment with other filter options to see if you can find something that would work the way you want.
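
One thing that might be worth experimenting with is an `--index-filter` that rebuilds each commit's index from more than one path, so that both the old and the new location survive the rewrite. The sketch below tries this on a throwaway repository. `olddir` and `newdir` are hypothetical names for the folder before and after it was moved, and the whole thing should be treated as a starting point rather than a proven recipe for a large real-world history.

```shell
#!/bin/sh
set -eu

# Hypothetical identity so the demo commits work without any git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"

# Build a small history in which the folder is renamed part-way through.
mkdir olddir
echo one > olddir/a.txt
git add .
git commit -q -m "add olddir"
echo x > unrelated.txt
git add .
git commit -q -m "unrelated change"
git mv olddir newdir
git commit -q -m "rename olddir to newdir"
echo two >> newdir/a.txt
git commit -q -am "change newdir"

# For every commit: empty the index, then re-add only the entries under
# olddir/ and newdir/ as they exist in that commit. Commits that end up
# with no changes are dropped by --prune-empty.
git filter-branch --prune-empty --index-filter '
    git read-tree --empty &&
    git ls-tree -r "$GIT_COMMIT" -- olddir newdir |
        git update-index --index-info
' -- --all

git log --oneline --name-status
```

Because the paths are kept as they were instead of being hoisted to the repository root, the extra "move everything back into the folder" commit is not needed with this variant.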


After following your tutorial, I didn’t manage to get back to the old state of the repo. How can I get back to the state it was in before I extracted the directory?

As I said before, the process I described does not change the original repository. If your original repository was changed, can you tell me how exactly it was changed?


Thanks @lee-dohm .

I’m using this exact method. But doing the following step in the “source” repo, causes all commit history to be lost/unviewable when I push my “destination” repo branch back up to GitHub:

mkdir foo
mv * foo
git add .
git commit

If I omit that step, I can view all of the commit history in GitHub, intact.

Is there any way possible to reintroduce the original “foo” directory structure in the “destination” repo, and preserve commit history in GitHub? I know there are a lot of posts about this, indicating that git log will show the original history, and I’ve confirmed that as well. But is there a way to cause GitHub to show it?

Thanks


Hi. This is a very nice demo, but there is one problem I have been struggling with and have not been able to solve since I started googling last week. The file I need to copy from the source repo to the target repo ends up in the root folder. I can’t find which option I must use to copy it into a specific folder in the destination repo. E.g., I have a file ‘myfile.txt’ in /sourcerepo/srcfolder/srcsubfolder and want to copy it into /targetrepo/targetfolder/targetsubfolder. I tried many options and asked many friends; everyone tried, but the file was copied only into /targetrepo, not into /targetrepo/targetfolder/targetsubfolder. Please help me copy it into targetsubfolder, or tell me where I should give the path for the target folder. I appreciate all your help. Thanks.

Could you not do the same process again, this time with the older directory?

A very nice tutorial.

Things can be simplified a bit by the use of the git subtree command. You can extract a history related to a subdirectory into a branch:

git subtree split -P subdir --annotate="(split)" -b split

This way you end up with the split branch holding a history related to the subdir directory, with each commit prefixed with the (split) string. Then you can use standard branch operation tools to push/merge the branch, etc. A good thing is that you can safely do it in the main repo without fear of destroying anything.
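
A rough sketch of that flow on throwaway repositories follows. The `git subtree add` step is one possible way of bringing the branch in under a chosen path, not the only one. `subdir`, `targetfolder/sub`, and the identity variables are placeholders, and `git subtree` has to be available in your Git installation (it ships as a contrib command, so some distributions package it separately).

```shell
#!/bin/sh
set -eu

# Hypothetical identity so the demo commits work without any git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

work=$(mktemp -d)

# Stand-in for the main repo.
git init -q "$work/main"
cd "$work/main"
mkdir subdir
echo one > subdir/a.txt
echo x > other.txt
git add .
git commit -q -m "add subdir and other"
echo two >> subdir/a.txt
git commit -q -am "change subdir"

# Extract the history of subdir into a branch named split; the current
# branch and the working tree are left untouched.
git subtree split -P subdir --annotate="(split)" -b split

# Stand-in for the destination repo.
git init -q "$work/destination"
cd "$work/destination"
echo readme > README
git add .
git commit -q -m "initial destination commit"

# Bring the split branch in under a path of our choosing.
git subtree add -P targetfolder/sub "$work/main" split

git log --oneline
```

Since the prefix is given explicitly to `git subtree add`, the files land in a subfolder of the destination rather than in its root.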

Great answer and work!

My terminal complained about this line, but a look at the files and folders showed that it had done its job.

mv * foo

mv: rename foo to foo/foo: Invalid argument

QUANGTHONG81. Thanks very much.

I had to do this recently, and I ended up using git-filter-repo. I was moving a subfolder from my current repo into a branch of another repo. I used the following commands:

Note: This will bring out the history to your repo root and completely re-write it, destroying the old history in the process.

git filter-repo --subdirectory-filter [DIRNAME] --force

Then, in the new repo, I created a new orphan branch:

git checkout --orphan [BRANCH_NAME]
git reset --hard

Run the following in the branch of the new repo, assuming you have both repos cloned locally:

git pull [local-path-to-filtered-repo] [BRANCH_NAME]
git push -o skip-validation [origin/upstream] [BRANCH_NAME]

Hope this helps.
