4.04 Terraform Data Sources
Overview
Data sources let Terraform read attributes from infrastructure it doesn't manage — resources created manually, by other IaC tools, or by a separate Terraform configuration — and use that data inside its own managed resources.
Abstract
A data block is Terraform's read-only counterpart to a resource block. It fetches information about an existing object without creating, updating, or destroying it, making that information available elsewhere in the configuration via the data object.
Why It Matters in Production
Real infrastructure is rarely managed by a single tool. Resources may exist from Puppet, CloudFormation, SaltStack, Ansible, ad-hoc scripts, manual provisioning, or even a different Terraform state. Data sources let a Terraform configuration reference values from those external resources — like a manually-provisioned database's host address — without taking ownership of them, avoiding duplicated infrastructure or unsafe imports.
Key Concepts
| Concept | Description |
|---|---|
| Data source (`data` block) | Reads attributes from an existing resource; does not create, update, or destroy it |
| Managed resource (`resource` block) | Creates, updates, and destroys infrastructure; fully owned by Terraform |
| `data` object | Where attribute values read by a data source are exposed for use elsewhere in configuration |
| Exported attributes | The specific fields a given data source makes available, defined per-provider in the Terraform Registry docs |
Common Use Cases
- Referencing a manually-provisioned AWS database's host, name, or user in a Terraform-managed application resource.
- Pulling values from infrastructure created outside Terraform's control — ad-hoc scripts, other config management tools, or a separate Terraform state directory.
- Looking up cloud provider metadata (e.g. the latest AMI ID, an existing VPC ID, or an existing IAM policy) without managing the lifecycle of that object.
- Reading a locally-created file's content to feed into a Terraform-managed resource, as in the `local_file` example below.
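The cloud metadata use case above can be sketched as follows. This is a minimal illustration, assuming the AWS provider is already configured; the AMI name pattern, the logical names `amazon_linux` and `app`, and the instance type are hypothetical choices, not values from this lesson.

```hcl
# Look up an existing AMI; Terraform reads it but never manages it.
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["al2023-ami-*-x86_64"] # illustrative pattern; adjust to the image you need
  }
}

# Managed resource that consumes the looked-up ID instead of a hardcoded one.
resource "aws_instance" "app" {
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro" # hypothetical size
}
```

Because `most_recent = true` can resolve to a newer image over time, a later plan may propose replacing the instance; pinning the filter more tightly avoids that if it is unwanted.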
Example Configuration or Commands
Resource managed outside Terraform
A file such as `/root/dog.txt` is created independently of Terraform, for example by a shell script.
Terraform has no knowledge of this file — it exists only in "real world infrastructure," not in terraform.tfstate.
Reading it with a data source
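# Read-only lookup of a file that Terraform does not manage.
data "local_file" "dog" {
  filename = "/root/dog.txt"
}

# Managed file whose content is taken from the data source above.
resource "local_file" "pet" {
  filename = "/root/pets.txt"
  content  = data.local_file.dog.content
}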
- The `data` keyword replaces `resource` to declare a read-only lookup.
- The resource type (`local_file`) follows, same as in a managed resource block.
- The logical name (`dog`) is used to reference this data elsewhere.
- Arguments inside the block (`filename`) are specific to that data source; check the Terraform Registry provider docs for which arguments and exported attributes apply.
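The value is referenced elsewhere in configuration as `data.local_file.dog.content`, following the pattern `data.<type>.<logical_name>.<attribute>`.
For the `local_file` data source, the exported attributes documented in the Terraform Registry include `content` and its base64-encoded equivalent, `content_base64`.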
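One way to see these attributes in practice is to surface them as outputs. The sketch below assumes the `data "local_file" "dog"` block from the example above is present in the same configuration; the output names are hypothetical.

```hcl
# Plain-text content of the file read by the data source.
output "dog_content" {
  value = data.local_file.dog.content
}

# The same content, base64-encoded by the provider.
output "dog_content_base64" {
  value = data.local_file.dog.content_base64
}
```

After `terraform apply`, running `terraform output` prints both values, which is also a quick check that the data source is reading the file you expect.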
Resource vs. data source comparison
| | Resource | Data Source |
|---|---|---|
| Keyword | `resource` | `data` |
| Capability | Creates, updates, destroys infrastructure | Only reads infrastructure |
| Also known as | Managed resource | Data resource |
Best Practices
- Check the Terraform Registry provider documentation for each data source's required arguments and exported attributes before use — these vary per resource type and provider.
- Use data sources instead of hardcoding values (IDs, ARNs, file contents) that come from infrastructure outside the current configuration.
- Prefer a data source over importing a resource into state when you only need to read its attributes, not manage its lifecycle.
- Keep data source lookups scoped narrowly (a single file, a single instance) rather than broad queries that could return ambiguous or multiple results.
Security Best Practices
- Data sources can expose sensitive attributes (e.g. database credentials, secrets stored in tags) into Terraform's plan output and state file — treat that output with the same care as managed resource state.
- Be cautious referencing data sources that read from files or systems writable by other processes; unexpected content changes will flow directly into your managed resources on the next apply.
- Restrict who can modify the externally-managed resource a data source reads from, since Terraform has no control over (or audit trail for) those changes.
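The first two points can be partly addressed in configuration. The following is a sketch under stated assumptions: it requires a Terraform version that supports `postcondition` blocks (1.2 or later), and the file path, logical name, and output name are hypothetical.

```hcl
data "local_file" "db_password" {
  filename = "/root/db_password.txt" # hypothetical path, written by another process

  lifecycle {
    # Fail the plan instead of propagating an empty or blanked-out file.
    postcondition {
      condition     = length(trimspace(self.content)) > 0
      error_message = "The password file is empty; refusing to use its content."
    }
  }
}

output "db_password" {
  value     = data.local_file.db_password.content
  sensitive = true # redacts the value in CLI output
}
```

Marking the output as sensitive only hides the value in plan and apply output. The content is still stored in the state file, so access to state needs to be restricted as well.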
Do and Don't
| ✅ Do | ❌ Don't |
|---|---|
| Use a data source to read attributes from externally-managed infrastructure | Manually duplicate values from external resources into your config |
| Check the Registry docs for exported attributes before referencing them | Guess at attribute names and assume they're consistent across providers |
| Reference data via `data.<type>.<name>.<attribute>` | Confuse a data source's logical name with a managed resource of the same type |
| Use data sources for true read-only lookups | Use a data source when you actually need Terraform to manage the resource's lifecycle |
Common Mistakes
- Forgetting the `data` keyword and accidentally declaring a second managed `resource` block instead, which would attempt to create or conflict with the external resource.
- Assuming every resource type's data source exposes the same attributes as its managed resource counterpart; exported attributes are defined per data source.
- Not realizing that changes to the externally-managed resource (like the `dog.txt` file content) will flow into the dependent managed resource on the next `terraform apply`.
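The first mistake is easiest to see side by side. This sketch reuses the `/root/dog.txt` file from the earlier example; the commented-out block and its placeholder content are illustrative only.

```hcl
# Mistake: a managed resource. Terraform would take ownership of the path,
# overwrite the existing file on apply, and delete it on destroy.
# resource "local_file" "dog" {
#   filename = "/root/dog.txt"
#   content  = "placeholder" # hypothetical value
# }

# Intended: a read-only lookup. The file stays owned by whatever created it.
data "local_file" "dog" {
  filename = "/root/dog.txt"
}
```

The two blocks differ only in the leading keyword and the `content` argument, which is why the slip is easy to make when copying an existing resource block.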
Troubleshooting
# Confirm what a data source is currently reading
terraform plan
# Inspect the resolved value of a data source after apply
terraform show
# Check the Terraform Registry for a provider's data source arguments and exports
# (no CLI command — refer to registry.terraform.io/providers/<provider>/latest/docs/data-sources)
Real-World Examples
Platform Team — Bridging Manually-Provisioned Databases
Scenario: A company had a production RDS database provisioned manually before adopting Terraform. Problem: New application infrastructure needed the database's host address and name, but the team didn't want to import and take ownership of the existing database in Terraform. Solution: Used a data source to read the existing database's attributes and passed them into a Terraform-managed application resource's connection configuration. Outcome: New infrastructure could reference the existing database safely without risking accidental modification or deletion of a resource Terraform didn't own.
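A setup like this one might look roughly as follows. It is a sketch, not the team's actual configuration: it assumes the AWS provider is configured, and the instance identifier, parameter names, and the choice of SSM parameters as the "application resource" are all hypothetical.

```hcl
# Read the manually-provisioned database without importing it into state.
data "aws_db_instance" "legacy" {
  db_instance_identifier = "legacy-prod-db" # hypothetical identifier
}

# Terraform-managed settings that the application reads at startup.
resource "aws_ssm_parameter" "db_host" {
  name  = "/app/db/host" # hypothetical parameter name
  type  = "String"
  value = data.aws_db_instance.legacy.address
}

resource "aws_ssm_parameter" "db_port" {
  name  = "/app/db/port" # hypothetical parameter name
  type  = "String"
  value = tostring(data.aws_db_instance.legacy.port)
}
```

Running `terraform destroy` on this configuration removes only the two parameters; the database itself is never part of the plan.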
Multi-Tool Infrastructure — Reading Ansible-Managed Config
Scenario: A hybrid environment where base servers were configured with Ansible, but application-layer resources were defined in Terraform. Problem: Terraform-managed resources needed values (like a generated config file's contents) that only existed after Ansible runs completed. Solution: Used a data source to read the Ansible-generated file and feed its content into the dependent Terraform resource. Outcome: The two tools coexisted without Terraform attempting to manage resources outside its scope.
Cloud Migration Team — Cross-State Data Sharing
Scenario: A large organization split infrastructure across multiple separate Terraform configurations (state files) by team. Problem: A networking team's VPC needed to be referenced by an application team's configuration, but merging state files wasn't desirable. Solution: Used data sources to look up the VPC's attributes from the networking team's already-provisioned resources, rather than duplicating the VPC definition. Outcome: Teams maintained independent state files while safely sharing required infrastructure values.
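In the application team's configuration, such a lookup could be sketched like this. The `Name` tag value, logical names, and CIDR block are hypothetical, and the example assumes the tag matches exactly one VPC in the configured region.

```hcl
# Find the networking team's VPC by tag; it is managed in their state, not this one.
data "aws_vpc" "shared" {
  tags = {
    Name = "network-team-vpc" # hypothetical tag value
  }
}

# Application-owned subnet placed inside the shared VPC.
resource "aws_subnet" "app" {
  vpc_id     = data.aws_vpc.shared.id
  cidr_block = "10.0.42.0/24" # hypothetical; must fall within the VPC's range
}
```

If the tag matches more than one VPC, the lookup fails rather than picking one, so a unique tag or an explicit `id` argument keeps the result unambiguous.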
Quick Recap
- Data sources read attributes from resources Terraform doesn't manage, using the `data` keyword instead of `resource`.
- They never create, update, or destroy infrastructure; they only read it.
- Retrieved values are accessed via `data.<type>.<logical_name>.<attribute>`.
- Each data source's required arguments and exported attributes are defined per provider in the Terraform Registry.
- Resources are "managed resources"; data sources are also called "data resources."
Interview / Revision Notes
- Q: What's the main difference between a resource and a data source? Resources create, update, and destroy infrastructure; data sources only read information from existing infrastructure.
- Q: What keyword declares a data source block? `data`, followed by the resource type and a logical name.
- Q: How do you reference a value read by a data source elsewhere in configuration? `data.<resource_type>.<logical_name>.<attribute>`.
- Q: Where do you find which arguments and attributes a data source supports? The Terraform Registry documentation for that provider's data sources.
- Q: What are resources and data sources also called? Resources are called managed resources; data sources are called data resources.