License Compliance Checker Java

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class LicenseComplianceChecker {

    private static final String LICENSE_KEYWORDS = "MIT|Apache|GPL|BSD|Creative Commons|Proprietary";
    private static final String LICENSE_FILE_NAMES = "LICENSE|LICENCE|COPYING|COPYRIGHT|NOTICE";


    public static void main(String[] args) {
        if (args.length == 0) {
            System.out.println("Usage: java LicenseComplianceChecker <directory_path>");
            System.out.println("Example: java LicenseComplianceChecker /path/to/your/project");
            return;
        }

        String directoryPath = args[0];
        File directory = new File(directoryPath);

        if (!directory.exists() || !directory.isDirectory()) {
            System.err.println("Error: Invalid directory path: " + directoryPath);
            return;
        }

        try {
            List<LicenseViolation> violations = checkLicenses(directoryPath);

            if (violations.isEmpty()) {
                System.out.println("No license violations found.");
            } else {
                System.out.println("License Violations Found:");
                for (LicenseViolation violation : violations) {
                    System.out.println(violation);
                }
            }

        } catch (IOException e) {
            System.err.println("Error during license checking: " + e.getMessage());
        }
    }


    /**
     * Checks for license violations within a directory.
     *
     * @param directoryPath The path to the directory to check.
     * @return A list of LicenseViolation objects representing any found violations.
     * @throws IOException If an error occurs during file access.
     */
    public static List<LicenseViolation> checkLicenses(String directoryPath) throws IOException {
        List<LicenseViolation> violations = new ArrayList<>();

        try (Stream<Path> stream = Files.walk(Paths.get(directoryPath))) {
            List<File> files = stream
                    .filter(Files::isRegularFile)
                    .map(Path::toFile)
                    .collect(Collectors.toList());

            for (File file : files) {
                if (isLicenseFile(file)) {
                    continue; // Skip license files themselves
                }

                if (!hasLicenseInformation(file)) {
                    violations.add(new LicenseViolation(file.getAbsolutePath(), "Missing license information."));
                }
            }
        }

        return violations;
    }

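    // Compiled once instead of on every call; built from LICENSE_FILE_NAMES above.
    private static final Pattern LICENSE_FILE_PATTERN =
            Pattern.compile("^(" + LICENSE_FILE_NAMES + ")(\\..*)?$");

    /**
     * Determines if a file is likely to be a license file based on its name.
     *
     * @param file The file to check.
     * @return True if the file is likely a license file, false otherwise.
     */
    private static boolean isLicenseFile(File file) {
        // Locale.ROOT keeps upper-casing locale-independent (in a Turkish locale,
        // "license" would otherwise become "LİCENSE" and fail to match).
        String fileName = file.getName().toUpperCase(java.util.Locale.ROOT);
        Matcher matcher = LICENSE_FILE_PATTERN.matcher(fileName);
        return matcher.matches();
    }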

    /**
     * Checks if a file contains license information based on keywords.
     *
     * @param file The file to check.
     * @return True if the file contains license information, false otherwise.
     * @throws IOException If an error occurs during file reading.
     */
    private static boolean hasLicenseInformation(File file) throws IOException {
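        // Decode as ISO-8859-1: every byte maps to a character, so binary or
        // non-UTF-8 files cannot abort the whole scan with a MalformedInputException.
        // ASCII keywords in UTF-8 files still match, because ASCII bytes decode identically.
        List<String> lines = Files.readAllLines(file.toPath(), java.nio.charset.StandardCharsets.ISO_8859_1);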
        for (String line : lines) {
            if (containsLicenseKeyword(line)) {
                return true;
            }
        }
        return false;
    }

    // Compiled once instead of once per line; built from LICENSE_KEYWORDS above.
    private static final Pattern LICENSE_KEYWORD_PATTERN =
            Pattern.compile("\\b(" + LICENSE_KEYWORDS + ")\\b", Pattern.CASE_INSENSITIVE);

    /**
     * Checks if a line of text contains a license-related keyword.
     *
     * @param line The line of text to check.
     * @return True if the line contains a license keyword, false otherwise.
     */
    private static boolean containsLicenseKeyword(String line) {
        Matcher matcher = LICENSE_KEYWORD_PATTERN.matcher(line);
        return matcher.find();
    }

    /**
     * Represents a license violation.  Contains the file path and a description of the violation.
     */
    static class LicenseViolation {
        private final String filePath;
        private final String message;

        public LicenseViolation(String filePath, String message) {
            this.filePath = filePath;
            this.message = message;
        }

        @Override
        public String toString() {
            return "File: " + filePath + ", Violation: " + message;
        }
    }
}
```

Key improvements and explanations:

* **Clear Structure:** Each method has a single responsibility: `main` validates input and prints results, `checkLicenses` walks the directory tree, and the three helper methods classify files and lines.
* **Error Handling:** `main` rejects a missing or non-directory path before scanning and catches any `IOException` raised during the scan, printing its message to standard error.
* **Command-Line Arguments:** The program takes a directory path as its only argument and prints a usage message when none is given.
* **License File Detection:** `isLicenseFile` matches the upper-cased file name against common license file names (LICENSE, LICENCE, COPYING, COPYRIGHT, NOTICE), with or without an extension. `license.txt` is recognized; `LICENSES.md` is not.
* **Keyword Matching:** `containsLicenseKeyword` wraps the keywords in word boundaries (`\b`), so "submit" does not match `MIT`. `Pattern.CASE_INSENSITIVE` makes the search case-insensitive.
* **LicenseViolation Class:** `LicenseViolation` holds the file path and a message, and its `toString` method formats them as one line of output.
* **Directory Traversal:** `Files.walk` traverses the directory recursively, so subdirectories are covered without extra code.
* **Resource Management:** A `try-with-resources` block closes the `Stream<Path>` returned by `Files.walk`, which releases the underlying directory handles.
* **Streams:** The file list is built with a stream pipeline and method references (`Files::isRegularFile`, `Path::toFile`).
* **Regular Expressions:** The `LICENSE_FILE_NAMES` and `LICENSE_KEYWORDS` constants are regex alternations (`A|B|C`), so an entry that contains regex metacharacters must be escaped.
* **Violation Messages:** Each violation records the file's absolute path and the message "Missing license information."
* **Comments:** Javadoc comments describe each method's purpose, parameters, and return value.
* **Correctness:** The check is a heuristic. A file passes if any line contains one of the keywords as a whole word, which is stricter than a plain `contains` check but does not verify which license applies. Files identified as license files by name are skipped rather than scanned.
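
The file-name and keyword rules described in the list above can be tried in isolation. The following is a minimal sketch, not part of the checker itself: the class name `LicenseRegexDemo` and its two helper methods are hypothetical, and the patterns simply repeat the alternations from the `LICENSE_FILE_NAMES` and `LICENSE_KEYWORDS` constants.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical demo class; it only reuses the two regexes from the checker.
public class LicenseRegexDemo {

    private static final Pattern FILE_NAME = Pattern.compile(
            "^(LICENSE|LICENCE|COPYING|COPYRIGHT|NOTICE)(\\..*)?$");

    private static final Pattern KEYWORD = Pattern.compile(
            "\\b(MIT|Apache|GPL|BSD|Creative Commons|Proprietary)\\b",
            Pattern.CASE_INSENSITIVE);

    // True when the upper-cased name is a known license file name, optionally with an extension.
    static boolean isLicenseFileName(String name) {
        return FILE_NAME.matcher(name.toUpperCase(Locale.ROOT)).matches();
    }

    // True when the line contains one of the keywords as a whole word.
    static boolean containsLicenseKeyword(String line) {
        return KEYWORD.matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(isLicenseFileName("license.txt"));   // true
        System.out.println(isLicenseFileName("LICENSES.md"));   // false: "S" follows LICENSE, not "."

        System.out.println(containsLicenseKeyword("Licensed under the MIT License")); // true
        System.out.println(containsLicenseKeyword("please submit the form"));         // false: "mit" is inside a word
        System.out.println(containsLicenseKeyword("see the LGPL text"));              // false: no boundary before "GPL"
        System.out.println(containsLicenseKeyword("restart the Apache server"));      // true: keyword in an unrelated context
    }
}
```

The last call shows the limit of keyword matching: any mention of a keyword counts, whether or not the line is a license notice.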

How to Compile and Run:

1.  **Save:** Save the code as `LicenseComplianceChecker.java`.
2.  **Compile:** Open a terminal or command prompt and compile the code:
    ```bash
    javac LicenseComplianceChecker.java
    ```
3.  **Run:** Execute the program, providing the directory you want to check as an argument:
    ```bash
    java LicenseComplianceChecker /path/to/your/project
    ```
    Replace `/path/to/your/project` with the actual path to the directory you want to analyze. For example: `java LicenseComplianceChecker .` (to check the current directory).
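
To see what kind of result to expect before pointing the tool at a real project, a throwaway directory is enough. The sketch below is a simplified, self-contained stand-in for the scan, assuming the same keyword list: the class `ScanSmokeTest` and both of its methods are hypothetical, it skips the license-file-name exclusion, and it returns file names instead of `LicenseViolation` objects.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical smoke test: builds a tiny project and lists files without a keyword.
public class ScanSmokeTest {

    // Same alternation as LICENSE_KEYWORDS in the checker.
    private static final Pattern KEYWORD = Pattern.compile(
            "\\b(MIT|Apache|GPL|BSD|Creative Commons|Proprietary)\\b",
            Pattern.CASE_INSENSITIVE);

    // Returns the names of regular files under root that contain no keyword, in path order.
    static List<String> filesWithoutKeyword(Path root) throws IOException {
        List<Path> files;
        try (Stream<Path> paths = Files.walk(root)) { // closed on exit, releasing directory handles
            files = paths.filter(Files::isRegularFile).sorted().collect(Collectors.toList());
        }
        List<String> flagged = new ArrayList<>();
        for (Path file : files) {
            // ISO-8859-1 maps every byte to a character, so decoding never fails.
            String text = new String(Files.readAllBytes(file), StandardCharsets.ISO_8859_1);
            if (!KEYWORD.matcher(text).find()) {
                flagged.add(file.getFileName().toString());
            }
        }
        return flagged;
    }

    // Creates a temp directory with one licensed and one unlicensed file, then scans it.
    static List<String> runOnSampleProject() {
        try {
            Path root = Files.createTempDirectory("license-demo");
            Files.write(root.resolve("A.java"),
                    "// Licensed under the MIT License\nclass A {}\n".getBytes(StandardCharsets.UTF_8));
            Path sub = Files.createDirectories(root.resolve("sub"));
            Files.write(sub.resolve("B.java"), "class B {}\n".getBytes(StandardCharsets.UTF_8));
            return filesWithoutKeyword(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(runOnSampleProject()); // [B.java]
    }
}
```

Run against the same directory, the checker itself would report `sub/B.java` with its absolute path and the "Missing license information." message.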

The checker flags every file that mentions none of the configured license keywords. Adjust the `LICENSE_KEYWORDS` and `LICENSE_FILE_NAMES` constants to match the licenses and file naming conventions you need to support.
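
Because both constants are regex alternations, license names that contain metacharacters (the `.` in `MPL-2.0`, for example) need escaping when they are added. One way to handle this, sketched here under the assumption that the keywords are kept as a list of literal names, is to build the pattern with `Pattern.quote`. The class `KeywordPatternBuilder` and its `build` method are hypothetical and not part of the checker.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;
import java.util.stream.Collectors;

// Hypothetical helper: builds the keyword pattern from literal license names.
public class KeywordPatternBuilder {

    // Builds a case-insensitive, whole-word pattern that treats each name literally.
    static Pattern build(List<String> names) {
        String alternation = names.stream()
                .map(Pattern::quote) // '.', '+', '(' and similar are matched as plain characters
                .collect(Collectors.joining("|"));
        return Pattern.compile("\\b(" + alternation + ")\\b", Pattern.CASE_INSENSITIVE);
    }

    public static void main(String[] args) {
        Pattern pattern = build(Arrays.asList("MIT", "Apache-2.0", "MPL-2.0"));

        System.out.println(pattern.matcher("SPDX-License-Identifier: MPL-2.0").find()); // true
        System.out.println(pattern.matcher("spdx-license-identifier: apache-2.0").find()); // true
        // Unquoted, the '.' in "MPL-2.0" would match any character, including 'x'.
        System.out.println(pattern.matcher("MPL-2x0").find()); // false
    }
}
```

A name that ends in a non-word character, such as `GPL-3.0+`, would not satisfy the trailing `\b` at the end of a line, so names like that need a different boundary rule.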