  ![phpcpd-next: A Modern Copy/Paste Detector CLI for PHP 8.5+](https://cdn.msaied.com/354/6acc1ed3419cc5284564bf3c7862cb82.png)


 phpcpd-next: A Modern Copy/Paste Detector CLI for PHP 8.5+ 
============================================================

2 Jul 2026 · 3 min read · Mohamed Said

Table of contents

1. [phpcpd-next: A Modern Copy/Paste Detector for PHP 8.5+](#phpcpd-next-a-modern-copypaste-detector-for-php-85)
2. [Three Detection Engines](#three-detection-engines)
3. [SARIF Output for GitHub Code Scanning](#sarif-output-for-github-code-scanning)
4. [Headless API and PHPUnit Assertions](#headless-api-and-phpunit-assertions)
5. [Incremental Caching for CI](#incremental-caching-for-ci)
6. [Laravel Preset and Installation](#laravel-preset-and-installation)
7. [Key Takeaways](#key-takeaways)

 phpcpd-next: A Modern Copy/Paste Detector for PHP 8.5+
------------------------------------------------------

Code duplication is one of those problems that sneaks past review and compounds quietly over time. [phpcpd-next](https://github.com/phpcpd-next/phpcpd) is a CLI tool that scans your PHP codebase and reports duplicated blocks — including cases that a simple text diff would miss.

Maintained by Luciano Federico Pereira as a successor to Sebastian Bergmann's archived `phpcpd`, it keeps the same `phpcpd` command so existing scripts and CI pipelines need no changes.

Three Detection Engines
-----------------------

Most copy/paste detectors only catch word-for-word duplicates. phpcpd-next ships three engines:

- **Rabin-Karp** — exact contiguous matches, fast by default
- **TokenBag** — order-invariant overlap, catches shuffled statements
- **Suffix tree** — opt-in gapped Type-3 clones, where a statement was inserted or removed between otherwise identical blocks
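
To make the first engine concrete, here is a minimal sketch of Rabin-Karp-style matching over a token list. It is an illustration under assumptions, not phpcpd-next's implementation: the function name `repeatedWindows`, the `RK_BASE` and `RK_MOD` constants, and the CRC32 token coding are all hypothetical, and the code assumes 64-bit PHP. A rolling hash is updated in constant time per step, and every hash hit is confirmed token by token to rule out collisions.

```php
<?php
declare(strict_types=1);

// Hypothetical parameters for this sketch; not taken from phpcpd-next.
const RK_BASE = 257;
const RK_MOD = 1_000_000_007;

/**
 * Return [earlierStart, laterStart] pairs of identical $window-token runs.
 *
 * @param list<string> $tokens
 * @return list<array{int, int}>
 */
function repeatedWindows(array $tokens, int $window): array
{
    $n = count($tokens);
    if ($window < 1 || $n < $window) {
        return [];
    }

    // Reduce each token to a small integer so products fit in 64-bit ints.
    $codes = array_map(static fn (string $t): int => crc32($t) % RK_MOD, $tokens);

    // RK_BASE^(window-1) mod RK_MOD: the weight of the token leaving the window.
    $high = 1;
    for ($i = 1; $i < $window; $i++) {
        $high = ($high * RK_BASE) % RK_MOD;
    }

    // Hash of the first window.
    $hash = 0;
    for ($i = 0; $i < $window; $i++) {
        $hash = ($hash * RK_BASE + $codes[$i]) % RK_MOD;
    }

    $seen = [];  // hash => list of window start offsets seen so far
    $pairs = [];
    for ($start = 0; ; $start++) {
        foreach ($seen[$hash] ?? [] as $earlier) {
            // Equal hashes may be collisions, so compare the actual tokens.
            if (array_slice($tokens, $earlier, $window) === array_slice($tokens, $start, $window)) {
                $pairs[] = [$earlier, $start];
            }
        }
        $seen[$hash][] = $start;

        if ($start + $window >= $n) {
            break;
        }

        // Roll the window: drop the outgoing token, shift, add the incoming one.
        $hash = ($hash - $codes[$start] * $high % RK_MOD + RK_MOD) % RK_MOD;
        $hash = ($hash * RK_BASE + $codes[$start + $window]) % RK_MOD;
    }

    return $pairs;
}

// The run "$a = 1 ;" occurs at offsets 0 and 5.
$tokens = ['$a', '=', '1', ';', 'echo', '$a', '=', '1', ';'];
print_r(repeatedWindows($tokens, 4));
```

A production detector would also merge adjacent windows into maximal clones; this sketch reports raw window pairs only.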

Rabin-Karp and TokenBag run together on every default scan. The suffix-tree engine is opt-in:

```bash
# Default: exact + reordered detection
phpcpd src/

# Rabin-Karp only
phpcpd --rk src/

# Gapped clones via suffix tree
phpcpd --algorithm=suffixtree src/
```
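
The difference between exact and order-invariant matching can be shown with PHP's built-in tokenizer. The sketch below is an assumption about the general idea, not the TokenBag engine itself: `significantTokens` and `bagOverlap` are hypothetical helper names, and the overlap score is a plain multiset intersection ratio. Two snippets with swapped statements differ as sequences but share every token.

```php
<?php
declare(strict_types=1);

/**
 * Token texts of a PHP snippet, without whitespace, comments, or the open tag.
 *
 * @return list<string>
 */
function significantTokens(string $code): array
{
    $tokens = [];
    foreach (PhpToken::tokenize($code) as $token) {
        if ($token->isIgnorable()) {
            continue;
        }
        $tokens[] = $token->text;
    }

    return $tokens;
}

/**
 * Share of tokens two lists have in common, ignoring order (0.0 to 1.0).
 *
 * @param list<string> $left
 * @param list<string> $right
 */
function bagOverlap(array $left, array $right): float
{
    $total = max(count($left), count($right));
    if ($total === 0) {
        return 0.0;
    }

    $leftCounts = array_count_values($left);
    $rightCounts = array_count_values($right);

    // Multiset intersection: each token counts as often as it appears on both sides.
    $shared = 0;
    foreach ($leftCounts as $token => $count) {
        $shared += min($count, $rightCounts[$token] ?? 0);
    }

    return $shared / (float) $total;
}

$first = significantTokens('<?php $a = 1; $b = 2;');
$second = significantTokens('<?php $b = 2; $a = 1;');

var_dump($first === $second);            // the sequences differ
var_dump(bagOverlap($first, $second));   // yet every token is shared
```

An exact engine compares sequences and would miss this pair, while a bag comparison scores it as a full overlap.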

Console output points at the duplicated ranges and suggests a refactor:

```text
Found 2 code clones with 21 duplicated lines in 2 files:
  - app/Services/Billing.php:12-33 (21 lines)
    app/Services/Invoicing.php:40-61
    → Consider extracting the shared lines into a reusable method or constant.
```

SARIF Output for GitHub Code Scanning
-------------------------------------

phpcpd-next writes four output formats: console text, PMD-CPD XML, JSON, and SARIF 2.1.0. The SARIF format integrates directly with GitHub Code Scanning, surfacing clones in the Security tab. Diverged clones map to `warning` severity; exact clones map to `note`.

```yaml
- name: Detect duplicated code
  run: vendor/bin/phpcpd --log-sarif=phpcpd.sarif src/ || true
- name: Upload results
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: phpcpd.sarif
```
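
For readers who have not seen SARIF before, the sketch below builds a minimal SARIF 2.1.0 log with one result per clone, using the severity mapping described above (exact clones as `note`, diverged clones as `warning`). The function names, the `duplicated-code` rule ID, the message text, and the driver name are assumptions made for illustration; the file phpcpd-next actually writes may be shaped differently.

```php
<?php
declare(strict_types=1);

/**
 * One SARIF result for a clone location.
 * The rule ID and message text are hypothetical.
 */
function cloneToSarifResult(string $file, int $startLine, int $endLine, bool $exact): array
{
    return [
        'ruleId' => 'duplicated-code',
        'level' => $exact ? 'note' : 'warning',
        'message' => ['text' => 'Duplicated code block'],
        'locations' => [[
            'physicalLocation' => [
                'artifactLocation' => ['uri' => $file],
                'region' => ['startLine' => $startLine, 'endLine' => $endLine],
            ],
        ]],
    ];
}

/**
 * Wrap results in the top-level SARIF structure: one run, one tool driver.
 *
 * @param list<array> $results
 */
function sarifLog(array $results): string
{
    return json_encode([
        '$schema' => 'https://json.schemastore.org/sarif-2.1.0.json',
        'version' => '2.1.0',
        'runs' => [[
            'tool' => ['driver' => ['name' => 'phpcpd-next']], // assumed driver name
            'results' => $results,
        ]],
    ], JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES | JSON_THROW_ON_ERROR);
}

// Line ranges taken from the console example earlier in the article.
echo sarifLog([
    cloneToSarifResult('app/Services/Billing.php', 12, 33, true),
    cloneToSarifResult('app/Services/Invoicing.php', 40, 61, true),
]);
```

GitHub reads `level` and the `physicalLocation` region of each result to place an alert on the right lines.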

Headless API and PHPUnit Assertions
-----------------------------------

Detection can run in-process via a static `detect()` call — no subprocess, no report files:

```php
use LucianoPereira\PhpcpdNext\Phpcpd;

$clones = Phpcpd::detect(
    paths: 'app',
    minTokens: 60,
    algorithm: null, // Rabin-Karp + TokenBag
    preset: 'laravel',
);
```

A bundled PHPUnit trait turns duplication into a regression test:

```php
use LucianoPereira\PhpcpdNext\PHPUnit\AssertNoDuplication;
use PHPUnit\Framework\TestCase;

final class DuplicationTest extends TestCase
{
    use AssertNoDuplication;

    public function test_app_is_dry(): void
    {
        $this->assertNoDuplication(__DIR__ . '/../app', minTokens: 70);
    }
}
```

Incremental Caching for CI
--------------------------

For larger codebases, `--cache` stores results keyed by a configuration fingerprint and a file-manifest hash, so a scan of an unchanged tree replays the cached result. `--incremental` goes further: it re-tokenizes only the changed files and reuses the rest from a per-file index. Incremental mode applies to the Rabin-Karp engine only:

```yaml
- uses: actions/cache@v4
  with:
    path: .phpcpd-cache
    key: phpcpd-${{ hashFiles('**/*.php') }}
    restore-keys: phpcpd-
- run: vendor/bin/phpcpd --incremental --cache-dir .phpcpd-cache src/
```
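
The two caching ideas can be sketched in a few lines. This is a guess at the general mechanism, not phpcpd-next's cache format: `cacheKey` and `planIncrementalScan` are hypothetical names, and SHA-256 over JSON is one arbitrary choice of fingerprint. The first function derives a key from the scan options plus a manifest of file hashes; the second compares the current hashes against a stored per-file index to decide which files need re-tokenizing.

```php
<?php
declare(strict_types=1);

/**
 * Cache key: configuration fingerprint plus file-manifest hash.
 *
 * @param array<string, mixed>  $options    scan options, e.g. ['minTokens' => 60]
 * @param array<string, string> $fileHashes path => content hash
 */
function cacheKey(array $options, array $fileHashes): string
{
    // Sort by key so the same inputs always serialize identically.
    ksort($options);
    ksort($fileHashes);

    $fingerprint = hash('sha256', json_encode($options, JSON_THROW_ON_ERROR));
    $manifest = hash('sha256', json_encode($fileHashes, JSON_THROW_ON_ERROR));

    return $fingerprint . ':' . $manifest;
}

/**
 * Split files into those reusable from the previous index and those to rescan.
 *
 * @param array<string, string> $previousIndex path => content hash from the last run
 * @param array<string, string> $currentHashes path => content hash now
 * @return array{reused: list<string>, scanned: list<string>}
 */
function planIncrementalScan(array $previousIndex, array $currentHashes): array
{
    $reused = [];
    $scanned = [];
    foreach ($currentHashes as $path => $hash) {
        if (($previousIndex[$path] ?? null) === $hash) {
            $reused[] = $path;   // unchanged: keep the stored tokens
        } else {
            $scanned[] = $path;  // new or modified: tokenize again
        }
    }

    return ['reused' => $reused, 'scanned' => $scanned];
}

// In-memory example: b.php changed and c.php is new since the last run.
$previous = ['a.php' => hash('sha256', 'A'), 'b.php' => hash('sha256', 'B')];
$current = ['a.php' => hash('sha256', 'A'), 'b.php' => hash('sha256', 'B2'), 'c.php' => hash('sha256', 'C')];

$plan = planIncrementalScan($previous, $current);
printf("(incremental index: %d reused, %d scanned)\n", count($plan['reused']), count($plan['scanned']));
```

Because the key includes the options, changing a setting such as the minimum token count produces a different key and forces a fresh scan.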

Laravel Preset and Installation
-------------------------------

The tool requires PHP 8.5+, `ext-dom`, and `ext-mbstring`. Install it as a dev dependency:

```bash
composer require --dev phpcpd-next/phpcpd
vendor/bin/phpcpd src/
```

A built-in Laravel preset scans `app`, `routes`, `database`, and `config` while automatically excluding vendor code, Blade views, migrations, and IDE-helper files:

```bash
vendor/bin/phpcpd --preset=laravel app/Services --min-tokens=60
```

Key Takeaways
-------------

- Drop-in replacement for the archived `phpcpd` with the same CLI command
- Three engines: exact (Rabin-Karp), reordered (TokenBag), and gapped (suffix tree)
- SARIF 2.1.0 output integrates with GitHub Code Scanning out of the box
- PHPUnit trait makes duplication a first-class test assertion
- Incremental indexing keeps CI scans fast on large codebases
- Zero Composer runtime dependencies; requires PHP 8.5+
- Built-in Laravel preset with sensible exclusions

---

*Source: [Laravel News — A Copy/Paste Detector CLI for PHP 8.5+](https://laravel-news.com/a-copypaste-detector-cli-for-php-85)*


 Frequently Asked Questions 
----------------------------

**Q1. What is the difference between phpcpd-next and the original phpcpd?**

phpcpd-next is a maintained successor to Sebastian Bergmann's archived phpcpd. It keeps the same `phpcpd` command as a drop-in replacement but adds two detection engines (TokenBag for reordered clones and a suffix-tree engine for gapped clones), four output formats including SARIF 2.1.0, a headless PHP API, a PHPUnit assertion trait, incremental CI caching, and framework presets including Laravel.

**Q2. How do I integrate phpcpd-next with GitHub Code Scanning?**

Run phpcpd-next with the `--log-sarif` flag to produce a SARIF 2.1.0 file, then upload it using the `github/codeql-action/upload-sarif@v3` action. Diverged clones appear as warnings and exact clones as notes in the GitHub Security tab.

**Q3. Does phpcpd-next slow down CI on large codebases?**

No. The `--cache` flag stores results keyed by a configuration fingerprint and file-manifest hash, replaying cached results when nothing has changed. The `--incremental` flag goes further by re-tokenizing only modified files and reusing the per-file index for everything else, printing a summary such as `(incremental index: 412 reused, 3 scanned)`.

