Gatsby-Source-Filesystem: Install How To Use
Gatsby-Source-Filesystem: Install How To Use
Gatsby-Source-Filesystem: Install How To Use
A Gatsby source plugin for sourcing data into your Gatsby application from your
local filesystem.
The plugin creates File nodes from files. The various “transformer” plugins can
transform File nodes into various other types of data e.g. gatsby-transformer-
json transforms JSON files into JSON data nodes and gatsby-transformer-
remark transforms markdown files into MarkdownRemark nodes from which you can
query an HTML representation of the markdown.
Install
npm install gatsby-source-filesystem
How to use
// In your gatsby-config.js
module.exports = {
plugins: [
// You can have multiple instances of this plugin
// to read source nodes from different locations on your
// filesystem.
//
// The following sets up the Jekyll pattern of having a
// "pages" directory for Markdown files and a "data" directory
// for `.json`, `.yaml`, `.csv`.
{
resolve: `gatsby-source-filesystem`,
options: {
name: `pages`,
path: `${__dirname}/src/pages/`,
},
},
{
resolve: `gatsby-source-filesystem`,
options: {
name: `data`,
path: `${__dirname}/src/data/`,
ignore: [`**/\.*`], // ignore files starting with a dot
},
},
],
}
Options
In addition to the name and path parameters you may pass an
optional ignore array of file globs to ignore.
**/*.un~
**/.DS_Store
**/.gitignore
**/.npmignore
**/.babelrc
**/yarn.lock
**/node_modules
../**/dist/**
How to query
You can query file nodes like the following:
{
allFile {
edges {
node {
extension
dir
modifiedTime
}
}
}
}
{
allFile(filter: { sourceInstanceName: { eq: "data" } }) {
edges {
node {
extension
dir
modifiedTime
}
}
}
}
Helper functions
gatsby-source-filesystem exports three helper functions:
createFilePath
createRemoteFileNode
createFileNodeFromBuffer
createFilePath
When building pages from files, you often want to create a URL from a file’s path
on the file system. E.g. if you have a markdown file at src/content/2018-01-23-an-
exploration-of-the-nature-of-reality/index.md, you might want to turn that
into a page on your site at example.com/2018-01-23-an-exploration-of-the-
nature-of-reality/. createFilePath is a helper function to make this task easier.
createFilePath({
// The node you'd like to convert to a path
// e.g. from a markdown, JSON, YAML file, etc
node,
// Method used to get a node
// The parameter from `onCreateNode` should be passed in here
getNode,
// The base path for your files.
// It is relative to the `options.path` setting in the `gatsby-source-
filesystem` entries of your `gatsby-config.js`.
// Defaults to `src/pages`. For the example above, you'd use `src/content`.
basePath,
// Whether you want your file paths to contain a trailing `/` slash
// Defaults to true
trailingSlash,
})
Example usage
createRemoteFileNode
When building source plugins for remote data sources such as headless CMSs, their
data will often link to files stored remotely that are often convenient to download
so you can work with them locally.
createRemoteFileNode({
// The source url of the remote file
url: `https://example.com/a-file.jpg`,
// The id of the parent node (i.e. the node to which the new remote File node
will be linked to.
parentNodeId,
// Gatsby's cache which the helper uses to check if the file has been
downloaded already. It's passed to all Node APIs.
getCache,
// OPTIONAL
// Adds htaccess authentication to the download request if passed in.
auth: { htaccess_user: `USER`, htaccess_pass: `PASSWORD` },
// OPTIONAL
// Adds extra http headers to download request if passed in.
httpHeaders: { Authorization: `Bearer someAccessToken` },
// OPTIONAL
// Sets the file extension
ext: ".jpg",
})
Example usage
exports.downloadMediaFiles = ({
nodes,
getCache,
createNode,
createNodeId,
_auth,
}) => {
nodes.map(async node => {
let fileNode
// Ensures we are only processing Media Files
// `wordpress__wp_media` is the media file type name for WordPress
if (node.__type === `wordpress__wp_media`) {
try {
fileNode = await createRemoteFileNode({
url: node.source_url,
parentNodeId: node.id,
getCache,
createNode,
createNodeId,
auth: _auth,
})
} catch (e) {
// Ignore
}
}
The file node can then be queried using GraphQL. See an example of this in
the gatsby-source-wordpress README where downloaded images are queried
using gatsby-transformer-sharp to use in the component gatsby-image.
Retrieving the remote file name and extension
The helper tries first to retrieve the file name and extension by parsing the url and
the path provided (e.g. if the url is https://example.com/image.jpg, the extension
will be inferred as .jpg and the name as image). If the url does not contain an
extension, we use the file-type package to infer the file type. Finally, the name
and the extension can be explicitly passed, like so:
createRemoteFileNode({
// The source url of the remote file
url: `https://example.com/a-file-without-an-extension`,
parentNodeId: node.id,
getCache,
createNode,
createNodeId,
// if necessary!
ext: ".jpg",
name: "image",
})
createFileNodeFromBuffer
When working with data that isn’t already stored in a file, such as when querying
binary/blob fields from a database, it’s helpful to cache that data to the filesystem
in order to use it with other transformers that accept files as input.
Example usage
The following example is adapted from the source of gatsby-source-mysql:
// gatsby-node.js
const createMySqlNodes = require(`./create-nodes`)
try {
queries
.map((query, i) => ({ ...query, ___sql: results[i] }))
.forEach(result =>
createMySqlNodes(result, results, createNode, {
createNode,
createNodeId,
getCache,
})
)
db.end()
} catch (e) {
console.error(e)
db.end()
}
}
// create-nodes.js
const { createFileNodeFromBuffer } = require(`gatsby-source-filesystem`)
const createNodeHelpers = require(`gatsby-node-helpers`).default
node[key] = value
}
node = ctx.createNode(node)
module.exports = createMySqlNodes
Troubleshooting
In case that due to spotty network, or slow connection, some remote files fail to
download. Even after multiple retries and adjusting concurrent downloads, you can
adjust timeout and retry settings with these environment variables:
GATSBY_STALL_RETRY_LIMIT, default: 3
GATSBY_STALL_TIMEOUT, default: 30000
GATSBY_CONNECTION_TIMEOUT, default: 30000