About This Page
This page is part of the Azure documentation. It contains code examples and configuration instructions for working with Azure services.
Bias Analysis
Bias Types:
- windows_first
- powershell_heavy
- windows_tools
- missing_linux_example
Summary:
The documentation page exhibits a strong Windows bias. All local environment setup instructions, directory paths, and file operations use Windows Command Prompt syntax (e.g., `cmd`, `DEL`, `notepad`, `C:\HDI`), and Microsoft Notepad is the only text editor mentioned. The PowerShell workflow is described in detail, but the page gives no equivalent Linux/Bash examples for local development, file editing, or uploading files to HDInsight. Linux users must infer the necessary commands and tools, even though HDInsight clusters themselves are Linux-based.
Recommendations:
- Provide parallel Linux/Bash examples for all command-line instructions, including directory creation, file deletion, and file editing (e.g., use `mkdir`, `rm`, `nano`/`vim`).
- Mention and demonstrate the use of cross-platform or Linux-native text editors (e.g., nano, vim, gedit) alongside Notepad.
- Include a section or callouts for Linux/macOS users, showing how to perform all steps (project setup, file editing, Maven usage, SCP/SSH) in a Unix-like environment.
- Offer Bash scripts or command blocks for uploading JAR files and running jobs, similar to the detailed PowerShell module, possibly using Azure CLI or direct SSH/SCP.
- Reorder or balance the presentation so that Linux and Windows instructions are given equal prominence, or provide tabs/switchers for each platform.
- Clarify in the prerequisites and test environment sections that the instructions are Windows-specific, and direct Linux/macOS users to the appropriate alternative steps.
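As an illustration of the first two recommendations, the flagged Command Prompt steps (directory creation, file deletion, and opening files in Notepad) could be mirrored in Bash roughly as follows. This is a minimal sketch, not text from the reviewed page: the function names are hypothetical, `$HOME/HDI` is an assumed stand-in for `C:\HDI`, and the editor falls back to `nano` only when `$EDITOR` is unset.

```shell
#!/usr/bin/env bash
set -e

# Assumption: $HOME/HDI stands in for C:\HDI. Override HDI_DIR to change it.
HDI_DIR="${HDI_DIR:-$HOME/HDI}"

# Counterpart of: IF NOT EXIST C:\HDI MKDIR C:\HDI, followed by cd C:\HDI
# (mkdir -p succeeds whether or not the directory already exists).
setup_workdir() {
    mkdir -p "$HDI_DIR"
    cd "$HDI_DIR"
}

# Counterpart of the two DEL commands. Run from the project root.
# -f keeps the command from failing if a file is already gone.
remove_generated_files() {
    rm -f src/main/java/com/microsoft/examples/App.java
    rm -f src/test/java/com/microsoft/examples/AppTest.java
}

# Counterpart of: notepad src\main\java\com\microsoft\examples\<file>
# Uses $EDITOR when it is set, otherwise nano.
edit_source() {
    "${EDITOR:-nano}" "src/main/java/com/microsoft/examples/$1"
}
```

After sourcing the script, `setup_workdir` creates and enters the working directory, and `edit_source CreateTable.java` opens the same file that the flagged `notepad` command targets.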
Flagged Code Snippets
IF NOT EXIST C:\HDI MKDIR C:\HDI
cd C:\HDI
notepad src\main\java\com\microsoft\examples\SearchByEmail.java
notepad src\main\java\com\microsoft\examples\DeleteTable.java
Start-HBaseExample -className com.microsoft.examples.CreateTable -clusterName $myCluster
DEL src\main\java\com\microsoft\examples\App.java
DEL src\test\java\com\microsoft\examples\AppTest.java
notepad src\main\java\com\microsoft\examples\CreateTable.java
<#
.SYNOPSIS
Runs an example class from the HBase application JAR on an HDInsight cluster.
.DESCRIPTION
Submits the specified class from hbaseapp-1.0-SNAPSHOT.jar as a job to the
HDInsight cluster, waits for it to complete, and displays the job output.
.EXAMPLE
Start-HBaseExample -className "com.microsoft.examples.CreateTable"
-clusterName "MyHDInsightCluster"
.EXAMPLE
Start-HBaseExample -className "com.microsoft.examples.SearchByEmail"
-clusterName "MyHDInsightCluster"
-emailRegex "contoso.com"
.EXAMPLE
Start-HBaseExample -className "com.microsoft.examples.SearchByEmail"
-clusterName "MyHDInsightCluster"
-emailRegex "^r" -showErr
#>
function Start-HBaseExample {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param(
        #The class to run
        [Parameter(Mandatory = $true)]
        [String]$className,
        #The name of the HDInsight cluster
        [Parameter(Mandatory = $true)]
        [String]$clusterName,
        #Only used when using SearchByEmail
        [Parameter(Mandatory = $false)]
        [String]$emailRegex,
        #Use if you want to see stderr output
        [Parameter(Mandatory = $false)]
        [Switch]$showErr
    )
    Set-StrictMode -Version 3
    # Is the Azure module installed?
    FindAzure
    # Get the login for the HDInsight cluster
    $creds = Get-Credential -Message "Enter the login for the cluster" -UserName "admin"
    # The JAR
    $jarFile = "wasb:///example/jars/hbaseapp-1.0-SNAPSHOT.jar"
    # The job definition
    $jobDefinition = New-AzHDInsightMapReduceJobDefinition `
        -JarFile $jarFile `
        -ClassName $className `
        -Arguments $emailRegex
    # Start the job
    $job = Start-AzHDInsightJob `
        -ClusterName $clusterName `
        -JobDefinition $jobDefinition `
        -HttpCredential $creds
    Write-Host "Wait for the job to complete ..." -ForegroundColor Green
    Wait-AzHDInsightJob `
        -ClusterName $clusterName `
        -JobId $job.JobId `
        -HttpCredential $creds
    if ($showErr)
    {
        Write-Host "STDERR"
        Get-AzHDInsightJobOutput `
            -ClusterName $clusterName `
            -JobId $job.JobId `
            -HttpCredential $creds `
            -DisplayOutputType StandardError
    }
    Write-Host "Display the standard output ..." -ForegroundColor Green
    Get-AzHDInsightJobOutput `
        -ClusterName $clusterName `
        -JobId $job.JobId `
        -HttpCredential $creds
}
<#
.SYNOPSIS
Copies a file to the primary storage of an HDInsight cluster.
.DESCRIPTION
Copies a file from a local directory to the blob container for
the HDInsight cluster.
.EXAMPLE
Add-HDInsightFile -localPath "C:\temp\data.txt"
-destinationPath "example/data/data.txt"
-ClusterName "MyHDInsightCluster"
.EXAMPLE
Add-HDInsightFile -localPath "C:\temp\data.txt"
-destinationPath "example/data/data.txt"
-ClusterName "MyHDInsightCluster"
-force
#>
function Add-HDInsightFile {
    [CmdletBinding(SupportsShouldProcess = $true)]
    param(
        #The path to the local file.
        [Parameter(Mandatory = $true)]
        [String]$localPath,
        #The destination path and file name, relative to the root of the container.
        [Parameter(Mandatory = $true)]
        [String]$destinationPath,
        #The name of the HDInsight cluster
        [Parameter(Mandatory = $true)]
        [String]$clusterName,
        #If specified, overwrites existing files without prompting
        [Parameter(Mandatory = $false)]
        [Switch]$force
    )
    Set-StrictMode -Version 3
    # Is the Azure module installed?
    FindAzure
    # Get authentication for the cluster
    $creds = Get-Credential
    # Does the local path exist?
    if (-not (Test-Path $localPath))
    {
        throw "Source path '$localPath' does not exist."
    }
    # Get the primary storage container
    $storage = GetStorage -clusterName $clusterName
    # Upload file to storage, overwriting existing files if -force was used.
    Set-AzStorageBlobContent -File $localPath `
        -Blob $destinationPath `
        -force:$force `
        -Container $storage.container `
        -Context $storage.context
}
function FindAzure {
    # Is there an active Azure subscription?
    $sub = Get-AzSubscription -ErrorAction SilentlyContinue
    if (-not($sub))
    {
        Connect-AzAccount
    }
}
function GetStorage {
    param(
        [Parameter(Mandatory = $true)]
        [String]$clusterName
    )
    $hdi = Get-AzHDInsightCluster -ClusterName $clusterName
    # Does the cluster exist?
    if (!$hdi)
    {
        throw "HDInsight cluster '$clusterName' does not exist."
    }
    # Create a return object for context & container
    $return = @{}
    $storageAccounts = @{}
    # Get storage information
    $resourceGroup = $hdi.ResourceGroup
    $storageAccountName = $hdi.DefaultStorageAccount.split('.')[0]
    $container = $hdi.DefaultStorageContainer
    $storageAccountKey = (Get-AzStorageAccountKey `
        -Name $storageAccountName `
        -ResourceGroupName $resourceGroup)[0].Value
    # Get the resource group, in case we need that
    $return.resourceGroup = $resourceGroup
    # Get the storage context, as we can't depend
    # on using the default storage context
    $return.context = New-AzStorageContext -StorageAccountName $storageAccountName -StorageAccountKey $storageAccountKey
    # Get the container, so we know where to
    # find/store blobs
    $return.container = $container
    # Return storage accounts to support finding all accounts for
    # a cluster
    $return.storageAccount = $storageAccountName
    $return.storageAccountKey = $storageAccountKey
    return $return
}
# Only export the verb-phrase things
export-modulemember *-*
cd C:\HDI\hbaseapp
$myCluster = "CLUSTERNAME"
Import-Module .\hbase-runner.psm1
Add-HDInsightFile -localPath target\hbaseapp-1.0-SNAPSHOT.jar -destinationPath example/jars/hbaseapp-1.0-SNAPSHOT.jar -clusterName $myCluster
Start-HBaseExample -className com.microsoft.examples.SearchByEmail -clusterName $myCluster -emailRegex contoso.com
Start-HBaseExample -className com.microsoft.examples.DeleteTable -clusterName $myCluster
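For comparison with the PowerShell upload-and-run steps above, a Linux/macOS user could take an SSH-based route along these lines. This is a hedged sketch rather than an equivalent of the module: it copies the JAR to the cluster head node with `scp` instead of uploading it to blob storage, and it runs the class with `yarn jar` over `ssh` instead of submitting a job through the Az cmdlets. The `sshuser` account name, the `CLUSTERNAME-ssh.azurehdinsight.net` host pattern, and all function and variable names are assumptions for illustration. Commands are only printed unless `DRY_RUN=0` is set.

```shell
#!/usr/bin/env bash
set -e

# Assumptions (not from the reviewed page): SSH account name and host pattern.
CLUSTER="${CLUSTER:-CLUSTERNAME}"
SSH_USER="${SSH_USER:-sshuser}"
JAR="hbaseapp-1.0-SNAPSHOT.jar"

# Print commands instead of executing them unless DRY_RUN=0.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Builds user@cluster-ssh.azurehdinsight.net from the variables above.
ssh_target() {
    printf '%s@%s-ssh.azurehdinsight.net' "$SSH_USER" "$CLUSTER"
}

# Rough counterpart of Add-HDInsightFile: copy the built JAR from target/
# to the SSH user's home directory on the head node. Run from the project root.
upload_jar() {
    run scp "target/$JAR" "$(ssh_target):$JAR"
}

# Rough counterpart of Start-HBaseExample: run one example class remotely.
# $1 is the class name; extra arguments (such as an email regex) are passed on.
# Regexes containing shell metacharacters need quoting for the remote shell.
run_example() {
    run ssh "$(ssh_target)" yarn jar "$JAR" "$@"
}
```

After sourcing the script with `CLUSTER` set to the cluster name, `upload_jar` followed by `run_example com.microsoft.examples.SearchByEmail contoso.com` prints the two commands that would run; setting `DRY_RUN=0` executes them instead.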