Marking season. Two words that can send shivers down an academic’s spine. While I genuinely love seeing what students create, the sheer volume and, let’s be honest, sometimes the state of submissions can be a real grind. My personal nemesis? The sprawling project folder with code scattered everywhere, forcing me into a click-fest of opening and closing files. My constant reminder to “Use the F***EN TEMPLATE!” (never actually said out loud in quite this manner) quickly became a meme among my students. They were quick to make the jokes… but alas, for some, that’s as far as the advice goes.
Even though tools like VS Code have a “Find in Files” feature, my brain prefers a single stream of information for that initial pass. Multiple tabs quickly become visual noise for me, and I lose my flow.
This frustration led to a little weekend project: a Python script! It’s nothing fancy, but it’s become my secret weapon. I point it at a student’s project folder, and bam, it pulls all the code into one consolidated file per file type (one for PHP, one for JavaScript, and so on), with a separator line showing where each original file begins. Suddenly, I can get a bird’s-eye view, spot if comments are MIA, and generally assess the lay of the land without playing tab-roulette. It’s amazing how much smoother this makes the initial triage.
I’m sharing this little script with you all because if it can save me this much headache, maybe it can help you too. Let’s make marking season a little more manageable, shall we?
import os


def combine_files_by_type(root_dir, file_extensions, output_base_name):
    """
    Combines all files of specified extensions in a directory and its
    subdirectories into separate output files, one for each extension.

    Args:
        root_dir (str): The path to the root directory to search for files.
        file_extensions (list): A list of file extensions to process (e.g., ['.php', '.js']).
        output_base_name (str): The base name for the output files.
            (e.g., 'combined' will result in 'combined.php', 'combined.js')
    """
    if not os.path.isdir(root_dir):
        print(f"Error: Directory '{root_dir}' not found.")
        return

    for extension in file_extensions:
        output_filename = f"{output_base_name}{extension}"
        output_abspath = os.path.abspath(output_filename)
        file_count = 0
        try:
            with open(output_filename, 'w', encoding='utf-8') as outfile:
                print(f"\nProcessing '{extension}' files into '{output_filename}'...")
                for subdir, dirs, files in os.walk(root_dir):
                    # Sort directories and files so the combined output has a consistent order
                    dirs.sort()
                    for filename in sorted(files):
                        if not filename.endswith(extension):
                            continue
                        filepath = os.path.join(subdir, filename)
                        # Skip the output file itself if it was created inside root_dir
                        if os.path.abspath(filepath) == output_abspath:
                            continue
                        # Path relative to root_dir, used in the separator for context
                        relative_path = os.path.relpath(filepath, root_dir)
                        try:
                            with open(filepath, 'r', encoding='utf-8') as infile:
                                content = infile.read()
                        except (OSError, UnicodeError) as e:
                            print(f"  Error reading file {filepath}: {e}")
                            continue
                        outfile.write(f"\n\n----- {relative_path} -----\n\n")
                        outfile.write(content)
                        print(f"  Added: {filepath}")
                        file_count += 1

            if file_count:
                print(f"Successfully combined {file_count} '{extension}' file(s) into '{output_filename}'")
            else:
                print(f"No '{extension}' files were combined from '{root_dir}'.")
                # Nothing was written, so remove the empty output file.
                if os.path.exists(output_filename) and os.path.getsize(output_filename) == 0:
                    os.remove(output_filename)
                    print(f"Removed empty output file: '{output_filename}'")
        except (OSError, UnicodeError) as e:
            print(f"Error writing to output file {output_filename}: {e}")


if __name__ == "__main__":
    input_directory = input("Enter the root directory to search: ").strip()
    output_basename = input("Enter the base name for the combined output files (e.g., 'combined_project'): ").strip()

    # Define the file types you want to process
    extensions_to_process = ['.php', '.js', '.html', '.css', '.sql']

    if input_directory and output_basename:
        combine_files_by_type(input_directory, extensions_to_process, output_basename)
    else:
        print("Input directory and output base name cannot be empty.")
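If you have a whole class worth of submissions sitting in one folder, typing each path into the prompts gets old quickly. Here is a rough sketch of how the same idea could be run over every student folder in one go. To be clear, this is not part of my script above: the names `combine_folder` and `combine_all_submissions` are made up for this illustration, and it assumes each student's project lives in its own subfolder of a single submissions directory. It uses `pathlib` instead of `os.walk` just to keep it short.

```python
from pathlib import Path


def combine_folder(root_dir, extension, output_path):
    """Concatenate every file under root_dir ending in `extension` into output_path.

    Returns the number of files combined. Nothing is written when there are none.
    (Hypothetical helper for this sketch, not the script above.)
    """
    root = Path(root_dir)
    # Sort the matches so the combined file has a stable order between runs.
    matches = sorted(p for p in root.rglob(f"*{extension}") if p.is_file())
    if not matches:
        return 0
    with open(output_path, "w", encoding="utf-8") as outfile:
        for path in matches:
            # Same separator style as the main script: path relative to the project root.
            outfile.write(f"\n\n----- {path.relative_to(root)} -----\n\n")
            # errors="replace" keeps one badly encoded file from aborting the run.
            outfile.write(path.read_text(encoding="utf-8", errors="replace"))
    return len(matches)


def combine_all_submissions(submissions_dir, extensions, output_dir):
    """Run combine_folder once per student folder and per extension.

    Assumes one subfolder per student inside submissions_dir.
    Returns {student_folder_name: {extension: file_count}}.
    """
    out_root = Path(output_dir)
    out_root.mkdir(parents=True, exist_ok=True)
    summary = {}
    for student in sorted(Path(submissions_dir).iterdir()):
        # Ignore loose files, and the output folder if it sits next to the submissions.
        if not student.is_dir() or student.resolve() == out_root.resolve():
            continue
        summary[student.name] = {
            ext: combine_folder(student, ext, out_root / f"{student.name}{ext}")
            for ext in extensions
        }
    return summary
```

Calling something like `combine_all_submissions("submissions", [".php", ".js"], "combined")` would leave one file per student per file type in the `combined` folder, and the returned summary gives a quick way to see who submitted no files of a given type at all.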