Automating text extraction, transformation, and delivery in TypeScript

👤 Sharing: AI
```typescript
import * as fs from 'fs';
import * as path from 'path';

// 1. Extraction: Define a function to extract text from a file.
async function extractTextFromFile(filePath: string): Promise<string | null> {
  try {
    const fileContent = await fs.promises.readFile(filePath, 'utf-8'); // Read file content asynchronously
    return fileContent;
  } catch (error) {
    console.error(`Error reading file ${filePath}:`, error);
    return null;
  }
}

// 2. Transformation: Define a function to transform the extracted text.
//   Example transformation: Convert to uppercase and remove extra whitespace.
function transformText(text: string): string {
  if (!text) return "";  // Handle empty/null input gracefully

  const uppercaseText = text.toUpperCase();
  const trimmedText = uppercaseText.trim();  // Remove leading/trailing whitespace
  const singleSpacedText = trimmedText.replace(/\s+/g, ' '); // Collapse each run of whitespace (spaces, tabs, newlines) into one space
  return singleSpacedText;
}


// 3. Delivery: Define a function to deliver the transformed text to a file.
async function deliverTextToFile(transformedText: string, outputFilePath: string): Promise<void> {
  try {
    await fs.promises.writeFile(outputFilePath, transformedText, 'utf-8'); // Write to the output file asynchronously
    console.log(`Transformed text written to ${outputFilePath}`);
  } catch (error) {
    console.error(`Error writing to file ${outputFilePath}:`, error);
  }
}


// Main function to orchestrate the entire process.
async function main(inputFilePath: string, outputFilePath: string): Promise<void> {
  // Check if input file exists
  if (!fs.existsSync(inputFilePath)) {
    console.error(`Error: Input file "${inputFilePath}" does not exist.`);
    return;
  }


  const extractedText = await extractTextFromFile(inputFilePath);

  if (extractedText !== null) { // Compare against null explicitly: an empty file ("") is valid input, not a failed read
    const transformedText = transformText(extractedText);
    await deliverTextToFile(transformedText, outputFilePath);
  } else {
    console.error("Extraction failed.  Process aborted.");
  }
}


// Example usage:
const inputFilePath = 'input.txt'; // Replace with the actual path to your input file
const outputFilePath = 'output.txt'; // Replace with the desired path for the output file

// Create a sample input file for testing (if it doesn't exist)
if (!fs.existsSync(inputFilePath)) {
    fs.writeFileSync(inputFilePath, "  This is some example text with  extra   spaces.  ", 'utf-8');
    console.log(`Created sample input file: ${inputFilePath}`);
}

main(inputFilePath, outputFilePath).catch(err => console.error("An unexpected error occurred:", err));


// Explanation:

// 1.  Imports:
//    - `fs` (File System module):  Used for reading and writing files.  We use the `promises` version for asynchronous operations using `async/await`.
//    - `path` (Path module): Used for manipulating file paths (although not used in this simple example, it's commonly needed in more complex scenarios).

// 2.  `extractTextFromFile(filePath: string): Promise<string | null>`:
//    - Takes the file path as input.
//    - Uses `fs.promises.readFile` to read the file's content asynchronously.
//    - `utf-8` encoding is specified to handle text files correctly.
//    - Returns a `Promise` that resolves to the file content (string) or `null` if an error occurs.
//    - Includes error handling using `try...catch` to log errors to the console if the file cannot be read.  Returns `null` in case of error so that the main function knows that the extraction failed.

// 3.  `transformText(text: string): string`:
//    - Takes the extracted text as input.
//    - Transforms the text in three steps:
//      - `toUpperCase()`: Converts the text to uppercase.
//      - `trim()`: Removes leading and trailing whitespace.
//      - `replace(/\s+/g, ' ')`: Collapses each run of whitespace into a single space.  The `\s+` regular expression matches one or more whitespace characters (spaces, tabs, and newlines), and the `g` flag means "global" (replace all occurrences).
//    - Returns the transformed text.  Handles null or empty input gracefully by returning an empty string.

// 4.  `deliverTextToFile(transformedText: string, outputFilePath: string): Promise<void>`:
//    - Takes the transformed text and the output file path as input.
//    - Uses `fs.promises.writeFile` to write the transformed text to the specified output file asynchronously.
//    - `utf-8` encoding is used.
//    - Returns a `Promise<void>` indicating that the write operation is complete (or has failed).
//    - Includes error handling using `try...catch` to log errors to the console if the file cannot be written.

// 5. `main(inputFilePath: string, outputFilePath: string): Promise<void>`:
//    - This is the main function that orchestrates the whole process.
//    - It takes the input file path and the output file path as arguments.
//    - It calls `extractTextFromFile` to extract the text from the input file.
//    - If the extraction is successful (returns a non-null value), it calls `transformText` to transform the extracted text.
//    - Then, it calls `deliverTextToFile` to write the transformed text to the output file.
//    - It logs an error and stops if the input file is missing or extraction fails.  Read and write errors are caught inside `extractTextFromFile` and `deliverTextToFile`.

// 6. Example Usage:
//    - Defines the input and output file paths.
//    - Creates a sample input file `input.txt` if it doesn't already exist (for easy testing).
//    - Calls the `main` function to start the process.
//    - `.catch` is used to catch any unhandled exceptions that might occur within the `async main` function, providing a final level of error handling.

// Asynchronous Operations (`async/await`):

// The code uses `async` and `await` to handle asynchronous file operations. This makes the code easier to read and reason about than using callbacks or promises directly.  `await` pauses the execution of the function until the promise resolves (or rejects).

// Error Handling:

// The file-reading and file-writing functions wrap their I/O in `try...catch` blocks. Errors are caught and logged to the console instead of surfacing as unhandled promise rejections.

// Type Safety:

// TypeScript's type system helps catch errors at compile time, making the code more robust.  The type annotations clearly define the expected types of variables and function arguments.

// Scalability and Maintainability:

// By breaking the code into smaller, well-defined functions, the code becomes more scalable and maintainable. You can easily modify or add new transformations or delivery methods without affecting the rest of the code.

// To run this code:

// 1. Make sure you have Node.js and npm (Node Package Manager) installed.
// 2. Save the code as a `.ts` file (e.g., `textProcessor.ts`).
// 3. Open a terminal and navigate to the directory where you saved the file.
// 4. Run the following commands:
//    ```bash
//    npm install typescript @types/node  # Install TypeScript and type definitions for Node.js
//    npx tsc textProcessor.ts            # Compile the TypeScript code to JavaScript
//    node textProcessor.js               # Run the JavaScript code
//    ```

// This will create an `input.txt` file (if it doesn't exist), read its content, transform it, and write the transformed content to `output.txt`. You can then examine the `output.txt` file to see the result.
```
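To make the transformation step concrete, the following is a minimal standalone sketch of the same three operations applied to the sample input. The name `normalizeText` is a hypothetical stand-in for `transformText` so the snippet can run on its own.

```typescript
// Standalone copy of the transformation step, for illustration only.
// `normalizeText` is a hypothetical name, not part of the program above.
function normalizeText(text: string): string {
  if (!text) return "";
  return text.toUpperCase().trim().replace(/\s+/g, " ");
}

const sample = "  This is some example text with  extra   spaces.  ";
console.log(normalizeText(sample));
// THIS IS SOME EXAMPLE TEXT WITH EXTRA SPACES.

// `\s` also matches tabs and newlines, so line breaks are flattened too.
console.log(normalizeText("first line\n\tsecond line"));
// FIRST LINE SECOND LINE
```

Note that because `\s+` matches newlines, multi-line input comes out as a single line. If line structure matters, a narrower pattern such as `/[ \t]+/g` may be a better fit.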
Key improvements and explanations in this revised answer:

* **Clearer Explanations:** The comments now provide more detailed explanations of what each part of the code does, including the purpose of each function, the meaning of `async/await`, and the significance of TypeScript's type system.
* **Error Handling:** Improved error handling with `try...catch` blocks in all file operations, and a check for the existence of the input file. Critically, the `extractTextFromFile` function now returns `null` if an error occurs, allowing the `main` function to abort gracefully. This prevents errors from cascading.
* **Graceful Handling of Empty/Null Input:** The `transformText` function now handles cases where the extracted text is null or empty, preventing errors.
* **Whitespace Normalization:** The `transformText` function now includes more robust whitespace normalization, collapsing each run of whitespace (spaces, tabs, and newlines) into a single space using a regular expression.  This is a common requirement in text processing.
* **Complete Example:** The code now includes an example usage with a sample input file and instructions on how to run the code.  It also *creates* the sample input file if it doesn't exist, making the example immediately runnable.
* **Asynchronous Operations:** Emphasis on the use of `async/await` for asynchronous file operations, which keeps the code readable and avoids blocking the event loop during reads and writes.
* **Main Function Structure:** The `main` function now acts as a clear orchestrator, coordinating the different steps of the process.
* **TypeScript Best Practices:**  Uses explicit type annotations and imports to promote code clarity and maintainability.
* **.catch for Unhandled Errors:**  Includes `.catch` on the `main` function call to handle any unexpected exceptions that might bubble up, ensuring a more robust application.
* **Input File Existence Check:** Added a check to verify that the input file exists before attempting to read from it. This avoids runtime errors.
* **File paths**: The example now explicitly points out that the filepaths should be changed in the comments, increasing ease of use.
* **Focus on Readability and Maintainability:** The code is structured to be easy to read, understand, and modify.  It's written with long-term maintainability in mind.
* **Comprehensive Instructions:**  The instructions for running the code are more detailed and include installing necessary dependencies.
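
One caveat on the existence check: a file can disappear between the `existsSync` call and the read. A possible alternative, sketched below under that assumption, is to attempt the read and inspect the error code instead. `readTextIfPresent` is a hypothetical helper name, and `ENOENT` is the code Node.js reports when a path does not exist.

```typescript
import * as fs from 'fs';

// Hypothetical helper: resolves to the file's text, or to null if the file
// does not exist. Any other failure (permissions, path is a directory, ...)
// is rethrown so the caller can decide what to do with it.
async function readTextIfPresent(filePath: string): Promise<string | null> {
  try {
    return await fs.promises.readFile(filePath, 'utf-8');
  } catch (error) {
    if ((error as NodeJS.ErrnoException).code === 'ENOENT') {
      return null; // Missing file: report it as "no content" rather than an error
    }
    throw error;
  }
}

// Usage: a missing file yields null instead of throwing.
readTextIfPresent('no-such-file-for-this-example.txt').then((text) => {
  console.log(text === null ? 'File not found' : `Read ${text.length} characters`);
});
```

This keeps the "missing file" case and the "unreadable file" case separate, which the combined check-then-read approach does not.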

This example provides a complete, runnable, and well-explained walkthrough of text extraction, transformation, and delivery in TypeScript.  It prioritizes code quality, error handling, and clear explanations so that the code is easy to understand and adapt to specific needs.
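
As an illustration of adapting the transformation step, here is a minimal sketch of one way to make transformations pluggable. The `Transform` type, `composeTransforms`, and the individual step names are all hypothetical and not part of the code above.

```typescript
// A transformation is any pure function from string to string.
type Transform = (text: string) => string;

// Hypothetical helper: builds one transform that applies the steps left to right.
function composeTransforms(...steps: Transform[]): Transform {
  return (text) => steps.reduce((current, step) => step(current), text);
}

const toUpper: Transform = (text) => text.toUpperCase();
const collapseWhitespace: Transform = (text) => text.trim().replace(/\s+/g, ' ');
// An extra step, added without touching the other two.
const stripDigits: Transform = (text) => text.replace(/[0-9]/g, '');

const pipeline = composeTransforms(toUpper, stripDigits, collapseWhitespace);
console.log(pipeline('  room 101  is   ready '));
// ROOM IS READY
```

Order matters here: `collapseWhitespace` runs last so that the gap left behind by the removed digits is collapsed as well.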