Git Interview Q&A for Data Science

Version control essentials for notebooks, scripts, and collaborative ML projects.

1. Why should Data Scientists learn Git? (easy)
Answer: Git tracks changes, enables collaboration, and preserves reproducible project history.
2. What is a commit? (easy)
Answer: A commit is a snapshot of the project's staged state, recorded with a message describing intent, plus the author, a timestamp, and a link to its parent commit.
3. What is the staging area? (easy)
Answer: It is an intermediate area (the index) where selected changes are prepared before committing.
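A minimal sketch of how staging and committing fit together, assuming a throwaway repository in a temporary directory. The file names `train.py` and `scratch.txt` and the demo identity are hypothetical. Only the file that was staged ends up in the commit.

```shell
# Create a throwaway repository (path, identity, and file names are placeholders).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Demo User"
git config user.email "demo@example.com"

# Two new files; only one is ready to be committed.
echo "print('training')" > train.py
echo "rough notes" > scratch.txt

# Stage only train.py, then snapshot the staged content with a message.
git add train.py
git commit -q -m "Add training script"

# scratch.txt is still untracked; train.py is part of the commit.
git status --short
git show --stat --oneline HEAD
```

Staging file by file like this is one way to keep exploratory files out of a commit without deleting them.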
4. What is the purpose of branching? (easy)
Answer: Branches isolate features and experiments so the main branch remains stable.
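To illustrate the isolation, here is a small sketch in a throwaway repository. The branch name `experiment/new-features` and the file `model.txt` are made up for the example.

```shell
# Throwaway repository (path, identity, and names are placeholders).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "baseline model" > model.txt
git add model.txt
git commit -q -m "Add baseline model"

# Start an experiment on its own branch (hypothetical name).
git checkout -q -b experiment/new-features
echo "extra features" >> model.txt
git commit -q -am "Try extra features"

# Switching back shows that main never saw the experimental change.
git checkout -q main
cat model.txt
git branch --list
```

If the experiment works out, it can be merged into main later; if not, the branch can simply be left alone or deleted.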
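5. Merge vs. rebase? (medium)
Answer: Merge joins two histories, usually by adding a merge commit, and leaves existing commits unchanged. Rebase replays commits onto a new base to produce a linear history, which rewrites those commits, so it is best avoided on commits that have already been shared.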
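The difference is easiest to see side by side. The sketch below, in a throwaway repository with hypothetical branch and file names, integrates the same feature commit twice: once with a merge (on a branch called `merged`) and once with a rebase (on a branch called `rebased`).

```shell
# Throwaway repository (path, identity, and names are placeholders).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "base" > base.txt
git add base.txt
git commit -q -m "Base commit"

# A feature branch adds one commit...
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

# ...while main moves ahead independently.
git checkout -q main
echo "hotfix" > hotfix.txt
git add hotfix.txt
git commit -q -m "Add hotfix"

# Option 1: merge. Both lines of history are kept and joined by a merge commit.
git checkout -q -b merged main
git merge -q --no-edit feature

# Option 2: rebase. The feature commit is replayed on top of main as a new commit.
git checkout -q -b rebased feature
git rebase -q main

# Compare the two resulting histories.
git log --oneline --graph merged
git log --oneline --graph rebased
```

The `merged` graph shows a fork and a join, while `rebased` is a straight line whose feature commit has a different hash from the original on `feature`.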
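6. What is a merge conflict? (medium)
Answer: A conflict occurs when Git cannot automatically reconcile overlapping edits, for example when two branches change the same lines of a file. Git pauses the merge until you resolve it manually.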
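A conflict can be reproduced deliberately. In this sketch (throwaway repository; the file `config.txt` and branch `tune` are hypothetical), two branches change the same line, the merge stops, and the conflict is resolved by writing the desired final content.

```shell
# Throwaway repository (path, identity, and names are placeholders).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "learning_rate = 0.1" > config.txt
git add config.txt
git commit -q -m "Add config"

# Branch "tune" changes the line one way...
git checkout -q -b tune
echo "learning_rate = 0.01" > config.txt
git commit -q -am "Lower learning rate"

# ...and main changes the same line another way.
git checkout -q main
echo "learning_rate = 0.5" > config.txt
git commit -q -am "Raise learning rate"

# Git cannot pick a winner, so the merge stops with a conflict.
git merge tune || echo "merge stopped: conflict in config.txt"

# The file now contains conflict markers (<<<<<<<, =======, >>>>>>>).
cat config.txt

# Resolve by writing the final content, then stage and commit to finish the merge.
echo "learning_rate = 0.01" > config.txt
git add config.txt
git commit -q -m "Merge tune: keep lower learning rate"
```

In practice the resolution is usually done in an editor or merge tool rather than by overwriting the file, but the stage-then-commit step is the same.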
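7. How do you ignore files? (easy)
Answer: Use .gitignore for datasets, temp files, secrets, and environment artifacts. Ignore rules apply only to untracked files; a file that is already committed stays tracked until it is removed from the index.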
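As a rough illustration, the following sketch sets up a throwaway repository with a dataset, a secrets file, and a script, then adds a small `.gitignore`. The paths and patterns are examples only, not a complete ignore list for any real project.

```shell
# Throwaway repository (path, identity, and file names are placeholders).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Demo User"
git config user.email "demo@example.com"

mkdir data
echo "id,label" > data/train.csv
echo "API_KEY=placeholder" > .env
echo "print('ok')" > analysis.py

# One pattern per line (example entries only).
cat > .gitignore <<'EOF'
data/
.env
__pycache__/
.ipynb_checkpoints/
EOF

git add .
git commit -q -m "Add analysis script and ignore rules"

# Only the script and the ignore file are tracked.
git ls-files
# check-ignore reports which rule matches a given path.
git check-ignore -v data/train.csv .env
```

`git check-ignore -v` is handy when a file is unexpectedly ignored, because it prints the file and line of the rule that matched.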
8. What is a remote repository? (easy)
Answer: A copy of the repository hosted on a server or a service such as GitHub, used to share work through push, fetch, and pull.
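The workflow can be tried without any hosting service. In this sketch a local bare repository stands in for the hosted remote; with GitHub, the path passed to `git remote add` would be a URL instead. All directory names are hypothetical.

```shell
# Working area for the demo (all paths and the identity are placeholders).
work="$(mktemp -d)"
cd "$work"

# A bare repository plays the role of the hosted remote.
git init -q --bare remote.git

# A local repository with one commit.
git init -q local
cd local
git checkout -q -b main
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "# churn-model" > README.md
git add README.md
git commit -q -m "Initial commit"

# Register the remote under the conventional name "origin" and publish main.
git remote add origin "$work/remote.git"
git push -q -u origin main

# A teammate would clone the shared copy and see the same history.
git clone -q -b main "$work/remote.git" "$work/teammate"
git -C "$work/teammate" log --oneline
```

The `-u` flag records `origin/main` as the upstream of the local `main`, so later `git push` and `git pull` calls need no extra arguments.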
9. What does a pull request do? (easy)
Answer: It proposes changes for review and discussion before they are merged into the target branch.
10. What is cherry-pick? (medium)
Answer: It applies the changes introduced by a specific commit as a new commit on the current branch, without merging the rest of the source branch.
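A short sketch of the typical use case, assuming a throwaway repository with hypothetical names: an `experiment` branch holds one useful fix and one unfinished commit, and only the fix is copied to main.

```shell
# Throwaway repository (path, identity, and names are placeholders).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > pipeline.txt
git add pipeline.txt
git commit -q -m "Add pipeline"

# The experiment branch gets a useful fix followed by unfinished work.
git checkout -q -b experiment
echo "fix: handle missing values" > fix.txt
git add fix.txt
git commit -q -m "Handle missing values"
fix_commit="$(git rev-parse HEAD)"   # remember the commit we want
echo "unfinished idea" > idea.txt
git add idea.txt
git commit -q -m "WIP: unfinished idea"

# Bring only the fix over to main, leaving the WIP commit behind.
git checkout -q main
git cherry-pick "$fix_commit"
ls
```

The cherry-picked commit on main has the same message and changes as the original but a different hash, because its parent is different.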
11. Why are tags useful? (easy)
Answer: Tags mark releases and checkpoints, such as model v1.0, so a known state can be reliably restored later.
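As an illustration, this sketch tags a commit as `v1.0`, makes a further change, and then restores a file from the tagged state. The repository is a throwaway and `model.txt` is a stand-in for real model artifacts or code.

```shell
# Throwaway repository (path, identity, and names are placeholders).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "weights v1" > model.txt
git add model.txt
git commit -q -m "Train model v1"

# An annotated tag marks this commit as the v1.0 release.
git tag -a v1.0 -m "Model v1.0"

echo "weights v2" > model.txt
git commit -q -am "Train model v2"

# Restore the file exactly as it was at the tagged release.
git checkout -q v1.0 -- model.txt
cat model.txt
git tag --list
```

Restoring a single path like this leaves the branch where it is; checking out the tag itself (`git checkout v1.0`) would instead move the whole working tree to the tagged commit.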
12. How do you inspect commit history? (easy)
Answer: Use git log, git show, and git diff to trace how the project evolved.
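The three commands can be tried on a tiny history. The sketch below uses a throwaway repository and a hypothetical `metrics.txt` file with two commits.

```shell
# Throwaway repository (path, identity, and names are placeholders).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "accuracy = 0.80" > metrics.txt
git add metrics.txt
git commit -q -m "Record baseline metrics"

echo "accuracy = 0.85" > metrics.txt
git commit -q -am "Record tuned metrics"

# One line per commit, newest first.
git log --oneline
# Full details and patch for the latest commit.
git show HEAD
# What changed in one file between the previous commit and the latest one.
git diff HEAD~1 HEAD -- metrics.txt
```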
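13. What is a detached HEAD? (medium)
Answer: HEAD points directly to a commit rather than to a branch. New commits made in this state belong to no branch and can become unreachable once you switch away, unless you create a branch for them first.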
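A sketch of how the state arises and how to keep work done in it, using a throwaway repository. The branch name `rescue-experiment` and the file names are hypothetical.

```shell
# Throwaway repository (path, identity, and names are placeholders).
repo="$(mktemp -d)"
cd "$repo"
git init -q
git checkout -q -b main
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "one" > log.txt
git add log.txt
git commit -q -m "First commit"
echo "two" >> log.txt
git commit -q -am "Second commit"

# Checking out a commit (not a branch) detaches HEAD.
git checkout -q HEAD~1
git rev-parse --abbrev-ref HEAD   # prints "HEAD" while detached

# A commit made here belongs to no branch...
echo "experiment" > notes.txt
git add notes.txt
git commit -q -m "Detached experiment"

# ...so give it a branch name before switching away.
git checkout -q -b rescue-experiment
git checkout -q main
git branch --list
```

Without the `rescue-experiment` branch, the detached commit would only be reachable through its hash or the reflog after switching back to main.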
14. What are Git best practices for notebooks? (hard)
Answer: Clear outputs before committing, keep notebooks small, pair them with scripts that hold reusable logic, and review changes with notebook-aware diff tools such as nbdime.
15. One-line Git summary for DS interviews? (easy)
Answer: Git is the backbone of reproducible and collaborative Data Science development.