
Inner Page Data Extraction

Pline's inner page data extraction feature gathers detailed information from grouped listing pages and their linked detail pages in one go.


Last updated 2 months ago

Inner Page Extraction builds on Pline's Automated Data Extraction feature, enabling you to collect data from the listing page and detailed pages simultaneously without having to manually open each detailed page.
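As a rough mental model only, the idea of combining listing data with detail-page data can be sketched in Python as below. This is not Pline's implementation: the function name `extract_with_inner_pages`, the field names, and the in-memory `detail_pages` lookup are all hypothetical stand-ins chosen for illustration.

```python
from typing import Callable

def extract_with_inner_pages(
    listing_rows: list[dict],
    fetch_detail: Callable[[str], dict],
    link_field: str = "Link",
) -> list[dict]:
    """Merge each listing row with the fields found on its linked detail page."""
    merged = []
    for row in listing_rows:
        # Follow the link captured on the listing page (hypothetical lookup).
        detail = fetch_detail(row[link_field])
        # One output record carries both listing and detail fields.
        merged.append({**row, **detail})
    return merged

# Stand-in data: two listing rows and the detail fields behind each link.
listing = [
    {"Title": "Trail Runner", "Price": "59.99", "Link": "/p/1"},
    {"Title": "City Sneaker", "Price": "45.00", "Link": "/p/2"},
]
detail_pages = {
    "/p/1": {"Color": "Blue", "Rating": "4.5"},
    "/p/2": {"Color": "Black", "Rating": "4.1"},
}

records = extract_with_inner_pages(listing, detail_pages.__getitem__)
```

Each resulting record holds the listing fields (title, price, link) alongside the fields taken from the page behind that link, which is why the link field has to be captured correctly on the listing page.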

Step-by-Step Guide to Setting Up Inner Page Extraction

This guide builds upon the steps in the How to Build an Automation Workflow section.

Step 1: Build an Automated Data Workflow

  • Launch the Pline extension and open the Automated Data Extraction mode.

  • Identify and group similar data fields you want to extract, such as product names or prices.

  • Select a "field" containing the link to the detailed or inner page and extract it as a "Link."

In our example, clicking on shoe titles provides an inner product details page, so they should be selected as links. Other field data types can be selected as required.

Selecting these fields as links is crucial: it is what lets Pline navigate to the correct detailed page.

  • Select the appropriate pagination type for your data source ("Click Next" in our example).

Step 2: Extract Data From Inner Pages

  • Pline will automatically display all available links for extraction. Select the correct link, and Pline will navigate to the corresponding inner page.

  • Capture additional data from the inner pages:

    • Choose the field name (e.g., "Color", "Rating").

    • Set the Data Field Type (e.g., Text for attributes like color or rating).

    • Click Save after selecting each field.

  • The Pline extension panel gives an overview of your automated workflow. Click "View sample data" to preview the data fields to be extracted using the workflow.

Step 3: Running the Workflow

  • Add a workflow name and click "Save Workflow" to configure the workflow that will extract data from both the listing and detail pages simultaneously.

  • Then, click "Use workflow now".

To execute a workflow and store all extracted records, you will need to create a dataset.

  • Click "Create Dataset" to initiate the data extraction process.

Once the data extraction process is completed, you can access your dataset in the Pline Platform dashboard or download it as a CSV or JSON file for analysis.
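If you analyze the downloaded file in Python, a quick check like the one below can flag records whose inner-page fields came back empty. This is a minimal sketch under assumptions: it presumes the CSV export has a header row and the JSON export is an array of objects, and the helper names and field names ("Color", "Rating") are hypothetical rather than part of Pline.

```python
import csv
import io
import json

def load_records(text: str, fmt: str) -> list[dict]:
    """Parse exported text into a list of records (assumed flat structure)."""
    if fmt == "json":
        data = json.loads(text)
        return data if isinstance(data, list) else [data]
    if fmt == "csv":
        return list(csv.DictReader(io.StringIO(text)))
    raise ValueError(f"unsupported format: {fmt}")

def missing_inner_fields(records: list[dict], inner_fields: list[str]) -> list[dict]:
    """Return the records where any inner-page field is absent or empty."""
    return [r for r in records if any(not r.get(f) for f in inner_fields)]

# Illustrative sample standing in for a downloaded CSV file.
SAMPLE_CSV = (
    "Title,Price,Color,Rating\n"
    "Trail Runner,59.99,Blue,4.5\n"
    "City Sneaker,45.00,,4.1\n"
)

records = load_records(SAMPLE_CSV, "csv")
incomplete = missing_inner_fields(records, ["Color", "Rating"])
```

For a real file, read its contents with `open(path, encoding="utf-8").read()` and pass the text to `load_records`. Records that show up in `incomplete` may point to a detail page whose layout differs from the one used when the workflow was built.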

Tips for Efficient Inner Page Extraction

Maximize the effectiveness of inner page extraction with these best practices:

  • Ensure the target website remains open until the data extraction process is complete.

  • Track the run status: "Records Collected" counts records that have been processed but not yet saved, while "Records Saved" confirms they have been securely stored.

  • Group similar data fields to optimize data selection.

  • Confirm the correct pagination type is selected to capture all necessary data.

  • Regularly monitor the target website for any structural changes.

Pline extension activated on Amazon for automated data extraction with two mode options shown.
Selecting “Link” as the data extraction type for product names on Amazon product listings.
Choosing "Next button" pagination option for structured page navigation during data scraping.
Workflow setup showing selected product details like reviews, ratings, color, and material.
Saving the created Pline workflow named "Amazon Shoes" with a preview of sample data.
Confirmation screen showing that the "Amazon Shoes" automation workflow is saved successfully.
Creating a dataset to store extracted Amazon product listings using the saved workflow.