Idea

A repository will be created in which the price history of Azure can be tracked. Since the API for pay-as-you-go prices does not provide historical data, a repository with past prices should be created as CSV files. These files can then be further processed for analysis purposes.

Notes and problems

The price list is approx. 200MB in size. The size of the files on GitHub is limited to 50MB. Although GitHub supports LFS, allowing a maximum file size of 2GB, repos with large files lock faster and are harder to manage. Therefore, LFS was omitted and the list was divided into approx. 40MB files (with a maximum of 125,000 records).

By default, the price list is delivered as JSON from the API and contains a nested structure with MeterID and other details. The nested structure was cleaned up and saved as a CSV. This means that the amount of data is smaller and the values are easier to process.

The API returns with the header value x-ms-ratelimit-remaining-retailPrices-requests whether the request limit has been reached and with x-ms-ratelimit-retailPrices-retry-after how long the waiting time until the next request is. Unfortunately the API does not use this header values and throws status code 429 instead.

The workflow is not based on the latest API. This means that the workflow does not take new features, such as Saving Plans. However, since it is mainly about the relationship between costs over time, this can be ignored.

Implementation

The repository contains the download script as a GitHub workflow. It starts automatically every 5th day of the month at 6am (line 5). The first data is downloaded in lines 32 to 38. If throttling kicks in, the loop starts again after 61 seconds (lines 39 to 43). The date of the retrieval is added in lines 44 to 47. The data is merged in lines 48 to 59 and saved to a file when $linestosplitfiles is exceeded. The URL for subsequent records is set in line 61 and the next loop run is carried out if the URL exists. In lines 62 to 66, the original throttling mechanism is implemented, which reads the appropriate header values. At the end, the last file is exported and the files are commit.

name: download-pricelist

on:
  schedule:
  - cron: "0 6 5 * *"
  workflow_dispatch:

permissions:
  contents: write

jobs:
  download-and-checkin-azure-prices:
    runs-on: windows-latest

    steps:
      - uses: actions/checkout@v4

      - name: 'Get Azure Prices and commit'
        shell: pwsh
        run: |
          git config --global user.name 'Workflow'
          git config --global user.email 'tzuehlke@users.noreply.github.com'
          Write-Output "Downloading MCA Prices..."
          $countalllines = 0
          $countfilelines = 0
          $countfiles = 0
          $linestosplitfiles = 125000
          $url = "https://prices.azure.com/api/retail/prices?api-version=2021-10-01-preview&currencyCode='EUR'"
          $prices = @()
          $start = Get-Date
          do{
              $resp = Invoke-RestMethod `
                  -Method Get `
                  -ContentType "application/json; charset=utf-8" `
                  -ResponseHeadersVariable respHeaders `
                  -SkipHttpErrorCheck `
                  -StatusCodeVariable "respStatusCode" `
                  -Uri $url
              if($respStatusCode -eq "429"){
                Write-Output "waiting for 61 seconds because of 429 response"
                Start-Sleep -Seconds (61)
                continue
              }
              $currentDate = Get-Date
              $resp.Items | % {
                  Add-Member -InputObject $_ -NotePropertyName "retrievalDate" -NotePropertyValue $currentDate
              }
              $prices += $resp.Items
              $countalllines += $resp.Count
              $countfilelines += $resp.Count
              if($countfilelines -ge $linestosplitfiles){
                  Write-Output "start export of $($prices.Count)"
                  $filename = "mca-$($start.ToString("yyyy-MM-dd_HH-mm-ss"))_p$('{0:d2}' -f $countfiles)"
                  #$prices | ConvertTo-Json | Out-File "$filename.json"
                  $prices | Export-Csv -Path "$filename.csv" -Delimiter ';' -NoTypeInformation -Encoding UTF8
                  $countfilelines = 0
                  $countfiles++
                  $prices = @()
              }
              Write-Output "retrieved $countalllines, open calls tills timeout $($respHeaders."x-ms-ratelimit-remaining-retailPrices-requests")"
              $url = $resp.NextPageLink
              if($respHeaders."x-ms-ratelimit-remaining-retailPrices-requests" -eq 0){
                  $wait = [int]($respHeaders."x-ms-ratelimit-retailPrices-retry-after"[0])
                  Write-Output "waiting $wait seconds"
                  Start-Sleep -Seconds ($wait)
              }
          }while($url)
          Write-Output "start export of $($prices.Count)"
          $filename = "mca-$($start.ToString("yyyy-MM-dd_HH-mm-ss"))_p$('{0:d2}' -f $countfiles)"
          $prices | Export-Csv -Path "$filename.csv" -Delimiter ';' -NoTypeInformation -Encoding UTF8
          $end = Get-Date
          Write-Output "started $start"
          Write-Output "end $end"
          git status -s
          git add -A
          git commit -m "Added price lists by workflow"
          git push