Arria
For more general information on data collection in Redbird, check out: Getting Started With Data Collection
This guide explains how to set up access and connect to the Arria Natural Language Generation Platform. With this dual-purpose application, you can:
- Feed a dataset (including one generated by another collect app) into Arria to be processed by the platform.
- Extract the output from Arria into Redbird, where you can download it immediately as a CSV or use it downstream in a workflow, e.g. to apply further transformation steps or visualize the data in a dashboard.
If you cannot see the Arria collection app in the left-side panel on the workflow canvas, refer to: Enabling Collection Apps Guide
Inputting Credentials
- Click the person icon located in the upper right-hand corner of the page
- Click Account Settings
- Click Apps
- Navigate to Arria
- Click on the settings cog associated with Arria
- Input the credentials. You can enter multiple API and Project Keys here; give each one a project name so you can easily reference it in the Arria collect app
- Click Done
- Click Save
Creating a Collection
- Prior to creating a collection, ensure you have:
  - Used the collect app associated with the project of choice (e.g. Department of Labor) to collect your data
  - Generated a dataset (or refreshed an existing dataset) with the data from the collection. All fields should be classified as the "Text" data type
  - Dropped the Arria node onto the canvas
- Double-click the Arria node
- Click the pencil next to Configuration Name to name your collection
- Click manage projects to enter or update the Arria API credentials, if required
- Choose the dataset you want to feed into Arria from the Select a Dataset dropdown
- Choose the project you want to reference in Arria (linked to your API credentials) from the Select a Project dropdown
- Use Update Method to choose how your data accumulates across future data pulls. Append keeps the historical data and adds each new data pull below it. Replace deletes the historical data and generates the new data in its place.
- Initial Data Load lets you upload historical data in bulk as a one-off if you have the data saved on your computer. Upload the data as a CSV file with no leading or trailing rows or columns; Redbird will run future data collections using the configuration you set up in the previous steps.
- Click Done
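The difference between the two Update Method options can be illustrated with a minimal Python sketch. It only models the behavior described above and is not Redbird's implementation; the function name `apply_update`, the `narrative` field, and the sample rows are hypothetical.

```python
from typing import Dict, List

Row = Dict[str, str]


def apply_update(existing: List[Row], new_pull: List[Row], method: str) -> List[Row]:
    """Return the dataset contents after a data pull (illustrative model only)."""
    if method == "append":
        # Append: keep the historical rows and add the new pull below them.
        return existing + new_pull
    if method == "replace":
        # Replace: discard the historical rows; only the new pull remains.
        return list(new_pull)
    raise ValueError(f"unknown update method: {method!r}")


# Hypothetical sample data: one earlier pull and one new pull.
history = [{"narrative": "Q1 summary"}]
latest = [{"narrative": "Q2 summary"}]

appended = apply_update(history, latest, "append")
replaced = apply_update(history, latest, "replace")
```

In this model, `appended` holds both rows in pull order, while `replaced` holds only the row from the latest pull.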
Running a Collection
- Click on the node
- In the right-side panel, click Run
- Once the configuration is run, click the node to open the right-side panel
- Click Raw Download for a CSV copy of the dataset without the System Unique ID or Date added columns
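If you already have a CSV copy that includes the system columns, the following Python sketch shows one way to produce an equivalent file without them. It is an illustration rather than a description of how Raw Download works; the exact header names `System Unique ID` and `Date added`, the function name `strip_system_columns`, and the sample data are assumptions.

```python
import csv
import io
from typing import Iterable

# Assumed header names for the system columns; adjust to match your file.
SYSTEM_COLUMNS = ("System Unique ID", "Date added")


def strip_system_columns(csv_text: str,
                         system_columns: Iterable[str] = SYSTEM_COLUMNS) -> str:
    """Return the CSV text with the given columns removed."""
    reader = csv.DictReader(io.StringIO(csv_text))
    drop = set(system_columns)
    # Keep the remaining columns in their original order.
    kept = [name for name in (reader.fieldnames or []) if name not in drop]

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=kept,
                            extrasaction="ignore", lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # extrasaction="ignore" silently skips the dropped columns.
        writer.writerow(row)
    return out.getvalue()


# Hypothetical sample download containing both system columns.
sample = "System Unique ID,Date added,narrative\n1,2024-01-01,Hello\n"
raw = strip_system_columns(sample)
```

Here `raw` contains only the `narrative` column and its single row.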
Refreshing the Workflow
To run the workflow and create a new output:
- Run the source collection, e.g. Department of Labor (ensure the Update Method is set to Replace)
- Click on the Arria collect app node (ensure the Update Method is set to Replace) and click Run
- Download the file from the right-side panel
