Sad Tux - Windows bias detected
This page contains Windows bias

About This Page

This page is part of the Azure documentation. It contains code examples and configuration instructions for working with Azure services.

Bias Analysis

Detected Bias Types
powershell_heavy
windows_first
missing_linux_example
windows_tools
Summary
The documentation page demonstrates a strong Windows bias by providing detailed PowerShell examples and a complete PowerShell script, while omitting equivalent Linux CLI or Bash examples. The primary programmatic approaches highlighted are PowerShell, .NET SDK, and ARM templates, with no mention of Azure CLI or Bash scripting for Linux users. The prerequisites and verification steps also assume a Windows-centric workflow, and there is no guidance for Linux-native tooling or scripting.
Recommendations
  • Add equivalent Azure CLI and Bash script examples for customizing HDInsight clusters, especially for configuration tasks currently shown only in PowerShell.
  • Include Linux prerequisites, such as Azure CLI installation and usage instructions.
  • Present cross-platform approaches (e.g., Azure CLI, REST API) before or alongside Windows-specific tools like PowerShell.
  • Ensure that all sample scripts and configuration steps are available for both Windows and Linux environments.
  • Update verification steps to include methods accessible from Linux (e.g., using curl, jq, or browser-agnostic instructions).
GitHub Create Pull Request

Scan History

Date Scan Status Result
2026-01-14 00:00 #250 in_progress Biased Biased
2026-01-13 00:00 #246 completed Biased Biased
2026-01-11 00:00 #240 completed Biased Biased
2026-01-10 00:00 #237 completed Biased Biased
2026-01-09 00:34 #234 completed Biased Biased
2026-01-08 00:53 #231 completed Biased Biased
2026-01-06 18:15 #225 cancelled Clean Clean
2025-09-16 00:00 #113 completed Clean Clean
2025-09-15 00:00 #112 completed Clean Clean
2025-09-14 00:00 #111 completed Clean Clean
2025-09-13 00:00 #110 completed Clean Clean
2025-09-12 00:00 #109 completed Clean Clean
2025-09-11 00:00 #108 completed Clean Clean
2025-09-10 00:00 #107 completed Clean Clean
2025-09-09 00:00 #106 completed Clean Clean
2025-09-08 00:00 #105 completed Clean Clean
2025-09-07 00:00 #104 completed Clean Clean
2025-09-06 00:00 #103 completed Clean Clean
2025-08-17 00:01 #83 cancelled Clean Clean
2025-07-13 21:37 #48 completed Clean Clean
2025-07-12 23:44 #41 cancelled Biased Biased
2025-07-09 13:09 #3 cancelled Clean Clean
2025-07-08 04:23 #2 cancelled Biased Biased

Flagged Code Snippets

# hive-site.xml configuration
$hiveConfigValues = @{ "hive.metastore.client.socket.timeout"="90s" }

$config = New-AzHDInsightClusterConfig `
         -ClusterType "Spark"  `
    | Set-AzHDInsightDefaultStorage `
        -StorageAccountResourceId "$storageAccountResourceId" `
        -StorageAccountKey $defaultStorageAccountKey `
    | Add-AzHDInsightConfigValue `
        -HiveSite $hiveConfigValues `
        -Spark2Defaults @{}

New-AzHDInsightCluster `
    -ResourceGroupName $resourceGroupName `
    -ClusterName $hdinsightClusterName `
    -Location $location `
    -ClusterSizeInNodes 2 `
    -Version "4.0" `
    -HttpCredential $httpCredential `
    -SshCredential $sshCredential `
    -Config $config
# hdfs-site.xml configuration
$HdfsConfigValues = @{ "dfs.blocksize"="64m" } #default is 128MB in HDI 3.0 and 256MB in HDI 2.1

# core-site.xml configuration
$CoreConfigValues = @{ "ipc.client.connect.max.retries"="60" } #default 50

# mapred-site.xml configuration
$MapRedConfigValues = @{ "mapreduce.task.timeout"="1200000" } #default 600000

# oozie-site.xml configuration
$OozieConfigValues = @{ "oozie.service.coord.normal.default.timeout"="150" }  # default 120
####################################
# Service names and variables
####################################

$nameToken = "<ENTER AN ALIAS>"
$namePrefix = $nameToken.ToLower() + (Get-Date -Format "MMdd")
$resourceGroupName = $namePrefix + "rg"
$hdinsightClusterName = $namePrefix + "hdi"
$defaultStorageAccountName = $namePrefix + "store"
$defaultBlobContainerName = $hdinsightClusterName
$location = "East US"

####################################
# Connect to Azure
####################################

Write-Host "Connecting to your Azure subscription ..." -ForegroundColor Green
$sub = Get-AzSubscription -ErrorAction SilentlyContinue
if(-not($sub))
{
    Connect-AzAccount
}

# If you have multiple subscriptions, set the one to use
#$context = Get-AzSubscription -SubscriptionId "<subscriptionID>"
#Set-AzContext $context

####################################
# Create a resource group
####################################

Write-Host "Creating a resource group ..." -ForegroundColor Green
New-AzResourceGroup `
    -Name  $resourceGroupName `
    -Location $location

####################################
# Create a storage account and container
####################################
	
Write-Host "Creating the default storage account and default blob container ..."  -ForegroundColor Green
New-AzStorageAccount `
    -ResourceGroupName $resourceGroupName `
    -Name $defaultStorageAccountName `
    -Location $location `
    -SkuName Standard_LRS `
    -Kind StorageV2 `
    -EnableHttpsTrafficOnly 1
	
$defaultStorageAccountKey = (Get-AzStorageAccountKey `
                                -ResourceGroupName $resourceGroupName `
                                -Name $defaultStorageAccountName)[0].Value

$defaultStorageContext = New-AzStorageContext `
                                -StorageAccountName $defaultStorageAccountName `
                                -StorageAccountKey $defaultStorageAccountKey
								
New-AzStorageContainer `
    -Name $defaultBlobContainerName `
    -Context $defaultStorageContext #use the cluster name as the container name

####################################
# Create a configuration object
####################################

$hiveConfigValues = @{"hive.metastore.client.socket.timeout"="90s"}
$storageAccountResourceId = (Get-AzStorageAccount -ResourceGroupName $resourceGroupName ` -Name $defaultStorageAccountName).Id

$config = New-AzHDInsightClusterConfig `
          -ClusterType "Spark"  `
    | Set-AzHDInsightDefaultStorage `
        -StorageAccountResourceId "$storageAccountResourceId" `
        -StorageAccountKey $defaultStorageAccountKey `
    | Add-AzHDInsightConfigValue `
        -HiveSite $hiveConfigValues `
		-Spark2Defaults @{}
		
####################################
# Set Ambari admin username/password
####################################

$httpUserName = "admin"  #HDInsight cluster username
$httpPassword = '<ENTER A PASSWORD>'

$httpPW = ConvertTo-SecureString -String $httpPassword -AsPlainText -Force
$httpCredential = New-Object System.Management.Automation.PSCredential($httpUserName,$httpPW)

####################################
# Set ssh username/password
####################################

$sshUserName = "sshuser" #HDInsight ssh user name
$sshPassword = '<ENTER A PASSWORD>'

$sshPW = ConvertTo-SecureString -String $sshPassword -AsPlainText -Force
$sshCredential = New-Object System.Management.Automation.PSCredential($sshUserName,$sshPW)

####################################
# Create an HDInsight cluster
####################################

New-AzHDInsightCluster `
    -ResourceGroupName $resourceGroupName `
    -ClusterName $hdinsightClusterName `
    -Location $location `
    -ClusterSizeInNodes 2 `
    -Version "4.0" `
    -HttpCredential $httpCredential `
    -SshCredential $sshCredential `
    -Config $config
	
####################################
# Verify the cluster
####################################

Get-AzHDInsightCluster `
    -ClusterName $hdinsightClusterName