Motivation
The Problem
The built environment is full of data. Layer structure, geometry, material properties, object attributes, energy performance figures, sensor readings linked to physical elements: all of it sits inside proprietary file formats that the rest of the data science world can’t easily reach. The tools that can read these files (AutoCAD, Revit, Rhino) are desktop applications built for design, not analysis. They’re not going to produce a ggplot2 chart or feed a Shiny dashboard.
The result is an unnecessary gap. Engineers and architects generate rich, spatially grounded data every day. Data scientists who could turn that data into insight have no good way to access it. Both teams end up frustrated, exporting things to CSV by hand and losing half the structure in the process.
The Opportunity
R has quietly become a go-to language for quantitative work in architecture, engineering, and construction (AEC). Structural engineers use it to crunch sensor data. Sustainability consultants model energy performance in it. BIM managers audit model quality across large portfolios with it. And Shiny brings all of that to non-technical stakeholders through interactive dashboards.
What’s been missing is a clean bridge between R’s data science ecosystem and the 3D model data locked inside CAD files. A DWG file isn’t just a drawing; it’s geometry, layer structure, material properties, and embedded attributes that are genuinely interesting to analyse. An as-built point cloud from a drone survey can be compared against a design mesh to quantify construction deviations. A BIM model wired to live sensor data becomes a digital twin that a Shiny app can query in real time.
The Solution
This book builds that bridge using the AutoDesk Platform Services (APS) cloud API as its primary engine. AutoDesk is the dominant platform in the AEC industry, and APS can translate design files between dozens of formats, extract rich geometry and metadata from BIM models, automate DWG processing at scale, and render 3D models interactively in a browser, all without a local AutoCAD or Revit installation.
The catch is that APS speaks REST. Every operation means authenticated HTTP requests, base64-encoded URNs, polling asynchronous job queues, and parsing deeply nested JSON. AutoDeskR (Govan 2024) wraps all of that in idiomatic R functions. Authentication, encoding, and polling are all handled. A workflow that would otherwise mean dozens of raw HTTP calls comes down to a handful of lines:
library(AutoDeskR)
# `id` and `secret` hold your APS app credentials, defined elsewhere
# Authenticate and request the scopes this workflow needs
token <- getToken(id = id, secret = secret, scope = "data:read data:write bucket:create")
# Create a storage bucket, then upload the drawing into it
bucket <- makeBucket(token = token, bucket = "my-project-bucket")
upload <- uploadFile(file = "model.dwg", token = token, bucket = bucket$bucketKey)
# Start a translation job that converts the upload to SVF
job <- translateFile(urn = upload$objectId, token = token, type = "svf")
The goal isn’t to paper over APS entirely. Understanding what’s happening under the hood makes you more effective. AutoDeskR just removes the boilerplate so you can spend your time on the analysis, not the plumbing.
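To make "polling asynchronous job queues" concrete, here is a minimal base R sketch of the kind of loop a wrapper hides. The names poll_until_done and check_status, and the status strings, are illustrative assumptions; this is not AutoDeskR's actual implementation.

```r
# Generic polling loop: call `check_status()` until it reports a terminal state.
# `check_status` is a hypothetical caller-supplied function standing in for
# whatever request reports job progress; it is not part of AutoDeskR.
poll_until_done <- function(check_status, interval = 5, max_tries = 60) {
  for (i in seq_len(max_tries)) {
    status <- check_status()
    if (status %in% c("success", "failed")) {
      return(list(status = status, tries = i))
    }
    Sys.sleep(interval)  # wait before asking again
  }
  stop("Job did not finish after ", max_tries, " checks")
}

# Usage with a fake job that reports success on the third check
fake_job <- local({
  calls <- 0
  function() {
    calls <<- calls + 1
    if (calls < 3) "inprogress" else "success"
  }
})
result <- poll_until_done(fake_job, interval = 0)
```

Capping the number of checks keeps a stalled job from blocking a script indefinitely, which matters once a loop like this runs unattended.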
Why R?
Fair question. Most AEC professionals use Python, .NET, or whatever scripting language AutoCAD yells at you in. So why drag R into this?
Because the audience isn’t architects; it’s analysts. The real use case here is the data scientist who gets handed a folder of DWG files and told to “make a dashboard out of this.” That person lives in R. They know dplyr, they know ggplot2, they know Shiny. AutoDeskR gives them a bridge to the CAD world without forcing them to learn a whole new stack.
Because the Python SDK is for building applications. It’s great if you’re writing production software. But if you want to pull geometry data, run some summary statistics, and produce a polished report with a table and a plot, R’s ecosystem (ggplot2, gt, Quarto) is still unmatched. The wrapper isn’t competing with the Python SDK; it’s serving a different job.
Because thin wrappers encode real expertise. Yes, a sophisticated user could call the APS REST API directly with httr2. But do they know that the object URN needs to be Base64-encoded before passing it to the Model Derivative API? That tokens expire after about an hour (roughly 3600 seconds)? That bucket names are globally unique across all APS applications? AutoDeskR quietly handles all of that for you.
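As an illustration of the first of those details, the sketch below encodes an object ID using base R only. It assumes the URL-safe, unpadded Base64 variant; encode_urn is a hypothetical helper, not an AutoDeskR function, and the object ID is made up.

```r
# Base64url-encode a string with base R only (hypothetical helper).
# Assumption: URL-safe alphabet ("-" and "_") with no "=" padding.
encode_urn <- function(object_id) {
  alphabet <- c(LETTERS, letters, 0:9, "-", "_")
  bytes <- as.integer(charToRaw(object_id))
  pad <- (3 - length(bytes) %% 3) %% 3        # zero bytes needed to fill the last group
  triples <- matrix(c(bytes, rep(0L, pad)), nrow = 3)
  # Pack each 3-byte group into one 24-bit number, then split it into four 6-bit indices
  n <- triples[1, ] * 65536L + triples[2, ] * 256L + triples[3, ]
  idx <- rbind(n %/% 262144L, (n %/% 4096L) %% 64L, (n %/% 64L) %% 64L, n %% 64L)
  chars <- alphabet[as.vector(idx) + 1L]
  # Drop the trailing characters that only encode the zero-byte filler
  if (pad > 0) chars <- chars[seq_len(length(chars) - pad)]
  paste(chars, collapse = "")
}

# Made-up object ID for illustration
urn <- encode_urn("urn:adsk.objects:os.object:my-project-bucket/model.dwg")
```

A real project would more likely lean on an existing encoder; writing it out by hand here just shows how small, and how easy to get subtly wrong, this step is.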
Because API drift is a fact of life, not a dealbreaker. AutoDesk updates its APIs. So do Google, AWS, and everyone else. Staying close to the API surface means fewer moving parts to break, and direct httr2 calls would have exactly the same maintenance problem with more boilerplate.
Because the free tier is for experimenting, and that’s the whole point. Once you’ve prototyped something worth running in production, you’re almost certainly inside an organisation that’s already paying for AutoDesk products, and API access comes with the subscription.
Why This Book Still Matters in the Age of AI
LLMs can write R code but can’t validate it against live APIs. An AI assistant can suggest getToken() but doesn’t know that bucket names are globally unique across every APS application in existence, that tokens expire after about an hour (roughly 3600 seconds), or that object URNs must be Base64-encoded before passing them to the Model Derivative API. The working patterns in this book are what an AI coding assistant needs to get the details right on the first try, not after three rounds of debugging cryptic 400 errors.
AutoDeskR as an MCP server turns AI agents into built-environment tools. The MCP Server chapter shows how to expose AutoDeskR as a tool that LLMs like Claude can call directly, letting an AI agent query live building data, pull sensor streams from a digital twin, or trigger a DWG-to-PDF conversion through natural language. Understanding the underlying API is what lets you decide which capabilities to expose and how to shape what those agents can actually do.
The data bridge problem hasn’t gone away. AI doesn’t change the fact that AEC data lives in proprietary formats on proprietary platforms. An LLM that can’t reach a DWG file, a BIM model, or a sensor stream can’t reason about it. AutoDeskR remains the bridge that puts that data in front of both R analysts and AI systems.
Structured wrappers reduce hallucination surface. Thin, well-named functions with clear arguments give AI coding tools a smaller surface to get wrong. Calling makeBucket(token, bucket, policy) is far harder to hallucinate badly than constructing a raw httr2 request with the correct endpoint, authentication header, content type, and JSON body shape.
Who This Book Is For
This book is for R users who work with or alongside CAD and BIM data and want to automate, analyse, or visualise it without leaving the R environment. The techniques here (mesh analysis, layer analytics, digital twin patterns, point cloud processing) apply to BIM and CAD data broadly, not just data that came from AutoDesk software. AutoDesk is where we start because that’s where most of the industry’s data currently lives, but the analytical mindset carries over wherever the files come from.
No prior experience with APS, AutoDesk, or the AEC industry is assumed. If you can install an R package and run a script, you have everything you need.