How to use Forensic Analyzer
- Drop a PDF into the analyzer. Scanning takes 1-5 seconds depending on file size and embedded image count.
- Read the risk badge: Critical (GPS, JavaScript), High (author PII, edit history), Medium (software fingerprints), Low (generic metadata).
- Expand each finding category to see raw field values. Author, Creator, Producer, and ModDate are the most common identity leaks.
- Check the EXIF section — embedded images may contain GPS coordinates, camera serial numbers, and timestamps independent of the PDF metadata.
- Use the direct links to the scrubber, flatten, or privacy pipeline to neutralize what was found. Run the analyzer again on the cleaned output to verify.
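The risk badge described above can be read as a roll-up of the individual finding categories. The sketch below is illustrative only: it assumes the badge shows the highest severity among the findings, which the steps above do not state outright. The category keys and the `overall_badge` function are hypothetical names, not part of the analyzer.

```python
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Hypothetical category keys; the tiers mirror the badge levels listed above.
CATEGORY_RISK = {
    "gps": Risk.CRITICAL,
    "javascript": Risk.CRITICAL,
    "author_pii": Risk.HIGH,
    "edit_history": Risk.HIGH,
    "software_fingerprint": Risk.MEDIUM,
    "generic_metadata": Risk.LOW,
}


def overall_badge(categories):
    """Return the highest risk among the found categories, or None if nothing was found.

    Assumption: the badge is the maximum severity; unknown categories count as LOW.
    """
    risks = [CATEGORY_RISK.get(name, Risk.LOW) for name in categories]
    return max(risks) if risks else None
```

Under that assumption, a file with only a Producer string and one geotagged photo would still get a Critical badge, because the single GPS finding outranks everything else.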
Tips
- The most common leak is an Author field containing your Windows username or company email. By default, Microsoft Office fills this field from your Office user name when you export to PDF.
- GPS coordinates in embedded JPEG images are rated Critical, the highest severity level. A single photo with EXIF GPS can reveal your exact location.
- JavaScript in PDFs can phone home when opened. The analyzer flags this as critical because it enables tracking on document open.
- Run analyze → scrub → analyze as a verification loop. The second analysis should show zero or near-zero findings.
- Software fingerprints (Creator: Microsoft Word, Producer: LibreOffice) reveal what tools you used. The analyzer rates these Medium: they matter for source protection but are a minor concern for most users.
- If you reach your quota, wait for the monthly reset or upgrade for unlimited usage.
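The analyze → scrub → analyze loop from the tips above can be sketched as a small function. This is a minimal illustration, not the tool's own code: `verification_loop` is a hypothetical name, and `toy_analyze` and `toy_scrub` are toy stand-ins that only look for two literal names in the raw bytes so the example runs on its own.

```python
from typing import Callable, List


def verification_loop(
    pdf_bytes: bytes,
    analyze: Callable[[bytes], List[str]],
    scrub: Callable[[bytes], bytes],
) -> dict:
    """Analyze, scrub, then analyze the cleaned output again."""
    before = analyze(pdf_bytes)
    cleaned = scrub(pdf_bytes)
    after = analyze(cleaned)
    return {
        "before": before,   # findings in the original file
        "after": after,     # findings that survived the scrub
        "clean": not after, # True only if the second pass found nothing
    }


# Toy stand-ins (assumptions for illustration, not a real analyzer or scrubber).
def toy_analyze(data: bytes) -> List[str]:
    return [key.decode() for key in (b"/Author", b"/JavaScript") if key in data]


def toy_scrub(data: bytes) -> bytes:
    return data.replace(b"/Author", b"").replace(b"/JavaScript", b"")


result = verification_loop(
    b"<< /Author (jdoe) /JavaScript 5 0 R >>", toy_analyze, toy_scrub
)
```

Anything left in `result["after"]` is what the second analysis still reports, and that is the list to review before sharing the file.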
What this does not cover
- Analysis is read-only. It does not modify the PDF in any way — use the scrubber for that.
- Encrypted or malformed PDFs fall back to byte-level pattern matching. Coverage is reduced, but EXIF markers and JavaScript signatures are still detected.
- Printer tracking dots (Machine Identification Codes) are not detected in the current analyzer. Use the MIC decoder research tool for that — it requires high-resolution page rendering.
- The analyzer checks PDF metadata and embedded image EXIF. It does not analyze text content for PII (names, addresses in the visible text).
- It does not replace legal, compliance, or incident-response workflows.
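To give a sense of what the byte-level fallback mentioned above involves, here is a minimal sketch that searches a file's raw bytes for a few well-known markers. The signature set and the `byte_scan` function are assumptions made for illustration; the analyzer's actual patterns are not documented here. A scan like this only sees markers stored in plain bytes, so anything inside a compressed or encrypted stream is invisible to it, which is one reason coverage is reduced.

```python
import re

# Assumed signatures for illustration; a real analyzer likely uses a larger set.
SIGNATURES = {
    # /JavaScript and /JS are the PDF names used by JavaScript actions.
    "javascript": re.compile(rb"/(?:JavaScript|JS)\b"),
    # "Exif\0\0" is the identifier at the start of a JPEG EXIF (APP1) segment.
    "exif": re.compile(rb"Exif\x00\x00"),
    # /Author is the Info dictionary key for the document author.
    "author": re.compile(rb"/Author\b"),
}


def byte_scan(data: bytes) -> dict:
    """Return {signature name: [byte offsets]} for each signature found in the raw bytes."""
    hits = {}
    for name, pattern in SIGNATURES.items():
        offsets = [match.start() for match in pattern.finditer(data)]
        if offsets:
            hits[name] = offsets
    return hits


sample = (
    b"%PDF-1.7\n1 0 obj << /S /JavaScript /JS (app.alert(1)) >> endobj\n"
    b"\xff\xd8\xff\xe1\x00\x10Exif\x00\x00"
)
hits = byte_scan(sample)
```

The word-boundary check keeps a name such as `/JSON` from being mistaken for `/JS`. A hit is only a hint that the feature is present, not a parsed finding.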