A comprehensive journalism portfolio website featuring 20+ years of entertainment journalism, reviews, and interviews. The site includes over 4,200 articles from publications like The New York Times, Los Angeles Times, Huffington Post, Entertainment Weekly, and many more.
- Extensive Article Archive: 4,200+ PDFs organized by publication
- Full-Text Search: Search across all articles with extracted text content
- Admin Panel: One-click updates for new content
- SEO Optimized: Automatically generated HTML versions of all PDFs
- Responsive Design: Works on all devices
- Social Integration: BlueSky, Facebook, and Substack newsletter
Follow these steps to add new articles to your website:
1. Save articles as PDFs
   - For each article, save or download it as a PDF from the publication's website
   - Use this filename format: `Publication-Title-MM-DD-YYYY.pdf`
   - Example: `HuffPo-Theater_Review_Hamilton-03-15-2025.pdf`
   - Place all PDFs in the "PDFs to Compress" folder on your Desktop
2. Compress all PDFs
   - Double-click "Compress PDFs on Desktop.command"
   - Wait for the "Processing complete!" message
   - All PDFs will be compressed at once
3. Move compressed PDFs to scans
   - Open the "Compressed PDFs" folder on your Desktop
   - Select ALL files in this folder
   - Move them to: `/Users/spencejb/Documents/GiltzWeb 2/scans/`
   - The "Compressed PDFs" folder should now be empty
4. Upload to website
   - Open your FTP client
   - Connect to `ftp.michaelgiltz.com`
   - Navigate to the `scans/` folder
   - Upload all the new PDFs you just moved
5. Update the website
   - Go to http://michaelgiltz.com/admin
   - Click the "Process New PDFs" button
   - The system will process all new articles at once
   - Wait for the "Website updated successfully!" message
Done! All your new articles are now live with embedded PDFs and searchable text.
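If you would rather script the upload step than use an FTP client, the sketch below shows one way to do it with Python's standard `ftplib`. It is not part of this repository: the function name `upload_pdfs` is hypothetical, and it assumes the server accepts plain FTP and that the remote folder is named `scans`. Your host may require `ftplib.FTP_TLS` instead.

```python
from pathlib import Path


def upload_pdfs(ftp, pdf_paths, remote_dir="scans"):
    """Upload the given PDF files to remote_dir and return the uploaded names.

    `ftp` is an already logged-in ftplib.FTP (or FTP_TLS) connection.
    `remote_dir` is an assumption based on the scans/ folder described above.
    """
    ftp.cwd(remote_dir)
    uploaded = []
    for path in map(Path, pdf_paths):
        # STOR sends the file in binary mode under its original name.
        with path.open("rb") as handle:
            ftp.storbinary(f"STOR {path.name}", handle)
        uploaded.append(path.name)
    return uploaded
```

To use it, open a connection with `ftplib.FTP("ftp.michaelgiltz.com")`, call `login()` with your own credentials, and pass in only the files you just moved into `scans/`, so the existing archive is not uploaded again.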
To save online articles as compressed PDFs:
- Open Terminal
- Navigate to project folder: `cd "/Users/spencejb/Documents/GiltzWeb 2"`
- Run archiver: `python3 archive_article_simple.py`
- Enter the article URL
- Enter filename in format: `Publication-Article_Title-MM-DD-YYYY`
- Find compressed PDF on your Desktop
Example: Variety-Oscar_Nominations_2024-01-23-2024.pdf
If you need to compress PDFs separately:
- Put PDFs in the "PDFs to Compress" folder on Desktop
- Double-click "Compress PDFs on Desktop.command"
- Find compressed PDFs in "Compressed PDFs" folder
Features:
- 72 DPI compression (~90% size reduction)
- Optimized for fast web viewing with linearization
- Originals automatically moved to "Original PDFs" folder
- Processes all PDFs in the folder at once
# Clone the repository
git clone https://github.com/jessespencersmith/michaelgiltz.git
cd michaelgiltz
# Install Python dependencies (if any)
pip install -r requirements.txt
# Run local test
python3 scripts/test_local.py
├── articles/                    # Combined HTML pages (4,200+ files) - NEW!
├── scans/                       # PDF articles (4,200+ files)
├── admin/                       # Admin panel
│   └── index.php                # Admin interface (updated)
├── scripts/                     # Processing scripts
│   ├── create_combined_pages.py # Creates HTML+PDF pages
│   ├── process_new_pdfs.py      # Processes new uploads
│   └── deploy.py                # Deployment script
├── *.htm                        # Main HTML pages (all updated)
├── search.php                   # Search functionality (updated)
├── giltz.css                    # Stylesheet
├── compress_pdf_web.py          # PDF compression tool
└── archive_article_simple.py    # Web article archiver
The site now uses a combined HTML+PDF system:
- Filename Format: `Publication-Title_of_Article-MM-DD-YYYY.pdf`
- Combined Pages: Each PDF gets an HTML page with:
  - Embedded PDF viewer at the top
  - Extracted text content below for SEO
  - Site navigation and branding
- Automatic Updates: Publication pages are updated with new article links
- Search Integration: Full text search across all articles
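To make the combined page layout concrete, here is a minimal sketch of how such a page could be assembled. It is not the code in `scripts/create_combined_pages.py`: the function name `build_combined_page`, the `article-text` class, and the relative paths `../scans/` and `../giltz.css` are assumptions drawn from the folder layout, and site navigation is left out for brevity.

```python
import html


def build_combined_page(pdf_name, title, extracted_text):
    """Return an HTML page with the PDF embedded above its extracted text."""
    safe_title = html.escape(title)
    safe_pdf = html.escape(pdf_name, quote=True)
    # One <p> per blank-line-separated paragraph, escaped so that
    # article text cannot inject markup into the page.
    paragraphs = "\n".join(
        f"<p>{html.escape(block.strip())}</p>"
        for block in extracted_text.split("\n\n")
        if block.strip()
    )
    return f"""<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>{safe_title}</title>
<link rel="stylesheet" href="../giltz.css">
</head>
<body>
<h1>{safe_title}</h1>
<embed src="../scans/{safe_pdf}" type="application/pdf" width="100%" height="800">
<div class="article-text">
{paragraphs}
</div>
</body>
</html>
"""
```

Placing the extracted text in ordinary paragraphs below the viewer is what lets search engines and the site search read the article without parsing the PDF.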
Located at /admin, the admin panel provides:
- One-click processing of new PDFs
- Shows unprocessed PDF count
- Recent PDF listing with status
- Processing statistics
- Test mode for verification
- Automatic publication page updates
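As an illustration of the unprocessed count, the sketch below treats a PDF as unprocessed when `articles/` has no page with the same base name. This is an assumption for illustration only: the admin panel itself is PHP and may track status differently, and both `find_unprocessed` and the `.html` extension are hypothetical.

```python
from pathlib import Path


def find_unprocessed(scans_dir, articles_dir):
    """Return the sorted names of PDFs that have no matching HTML page."""
    # Assumption: a combined page shares its base name with its PDF.
    processed = {page.stem for page in Path(articles_dir).glob("*.html")}
    return sorted(
        pdf.name
        for pdf in Path(scans_dir).glob("*.pdf")
        if pdf.stem not in processed
    )
```

Calling `len(find_unprocessed("scans", "articles"))` from the project root would then give the number of PDFs still waiting to be processed.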
- Full-text search across all articles
- Searches extracted text content in HTML pages
- Returns excerpts with highlighted search terms
- Links directly to combined HTML pages
- Much faster and more accurate than before
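The excerpt-and-highlight behavior can be sketched in a few lines. The site's search is implemented in `search.php`, so this Python version is only an illustration of the idea. The function name `make_excerpt`, the 60-character radius, and the `<mark>` tag are all assumptions.

```python
import html
import re


def make_excerpt(text, term, radius=60):
    """Return an HTML excerpt around the first match of term, or None."""
    if not term:
        return None
    pattern = re.compile(re.escape(term), re.IGNORECASE)
    first = pattern.search(text)
    if first is None:
        return None
    start = max(first.start() - radius, 0)
    end = min(first.end() + radius, len(text))
    snippet = text[start:end]
    # Escape each piece separately so the highlight tags are the only markup.
    parts = []
    last = 0
    for match in pattern.finditer(snippet):
        parts.append(html.escape(snippet[last:match.start()]))
        parts.append(f"<mark>{html.escape(match.group(0))}</mark>")
        last = match.end()
    parts.append(html.escape(snippet[last:]))
    prefix = "..." if start > 0 else ""
    suffix = "..." if end < len(text) else ""
    return prefix + "".join(parts) + suffix
```

For example, `make_excerpt("A review of Hamilton on Broadway", "hamilton")` wraps the matched word in `<mark>` while keeping its original capitalization.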
host = "ftp.michaelgiltz.com"
username = "[email protected]"
remote_dir = "/"

The admin panel is protected with HTTP Basic Authentication. To set up:
- Use an online htpasswd generator
- Create `.htpasswd` file in `/admin`
- Update `.htaccess` with correct path
- Format: `Publication-Title_of_Article-MM-DD-YYYY.pdf`
- Publication: No spaces (use HuffPo, not Huffington Post)
- Title: Use underscores for spaces
- Date: MM-DD-YYYY format
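A quick way to catch naming mistakes before uploading is to check each filename against the convention. The sketch below is one possible check, not the validation used by the processing scripts. The regular expression and the function name `parse_article_filename` are assumptions based on the rules listed above.

```python
import re
from datetime import datetime

# Hypothetical pattern for Publication-Title_of_Article-MM-DD-YYYY.pdf:
# no spaces anywhere, no hyphen in the publication, date at the end.
FILENAME_RE = re.compile(
    r"^(?P<publication>[^-\s]+)-(?P<title>\S+)-(?P<date>\d{2}-\d{2}-\d{4})\.pdf$"
)


def parse_article_filename(filename):
    """Return (publication, title, date) or raise ValueError."""
    match = FILENAME_RE.match(filename)
    if match is None:
        raise ValueError(f"filename does not match the convention: {filename!r}")
    # strptime raises ValueError for impossible dates such as 13-45-2025.
    published = datetime.strptime(match["date"], "%m-%d-%Y").date()
    title = match["title"].replace("_", " ")
    return match["publication"], title, published
```

With this sketch, `HuffPo-Theater_Review_Hamilton-03-15-2025.pdf` parses into the publication `HuffPo`, the title `Theater Review Hamilton`, and the date March 15, 2025, while a name such as `Huffington Post-Review-03-15-2025.pdf` is rejected because of the space.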
- HuffPo (Huffington Post)
- BookFilter
- Popsurfing
- NYPost (New York Post)
- LATimes (Los Angeles Times)
- And 30+ more...
python3 upload_site.py

This uploads:
- All HTML pages with updated links
- 4,213 combined article pages
- Updated search functionality
- Admin panel updates
python3 scripts/create_combined_pages.py
python3 deploy_html.py

- Admin panel password protected
- Sensitive files excluded from repository
- FTP credentials stored locally only
- `.htaccess` protects sensitive directories
- Check filename format matches convention
- Ensure no special characters in filename
- Verify PDF is valid and not corrupted
- Verify `.htpasswd` exists in admin directory
- Check `.htaccess` path is correct
- Ensure password was generated with Apache MD5

- Verify `extracted_content/` directory exists
- Check PHP is enabled on server
- Run full update to regenerate HTML files
- Combined HTML+PDF pages for all articles
- Improved search with full text indexing
- PDF compression workflow
- Web article archiving tool
- Automatic publication page updates
- Enhanced admin panel
- Automatic social media posting
- Analytics integration
- Comment system
- Related articles feature
- Mobile app
- RSS feed generation
python3 scripts/test_local.py

- Create feature branch
- Test locally
- Update documentation
- Submit pull request
© 2025 Michael J. Giltz. All rights reserved.
Articles and content are the sole property of Michael Giltz and the original publishers. Code and infrastructure may be used with attribution.
While this is primarily a personal portfolio site, bug reports and feature suggestions are welcome:
- Fork the repository
- Create your feature branch (`git checkout -b feature/AmazingFeature`)
- Commit your changes (`git commit -m 'Add some AmazingFeature'`)
- Push to the branch (`git push origin feature/AmazingFeature`)
- Open a Pull Request
Michael Giltz
- Email: [email protected]
- BlueSky: @mgiltz.bsky.social
- Facebook: michael.giltz
- Newsletter: Subscribe on Substack
- Built with Python, PHP, and classic web technologies
- Hosted on BlueHost
- PDF processing powered by PyPDF2
- Over 20 years of journalism archived and searchable
Note: This repository contains the website infrastructure. The actual article PDFs are not included due to size and copyright considerations.