About This Page
The page under review is part of the Azure documentation. It contains code examples and configuration instructions for working with Azure services.
Bias Analysis
Bias Types:
- ⚠️ windows_first
- ⚠️ powershell_heavy
- ⚠️ missing_linux_example
Summary:
The documentation demonstrates a Windows bias by exclusively using Visual Studio (a Windows-centric IDE) for all development, build, and deployment steps. It recommends Windows tools (Visual Studio, Data Lake Tools for Visual Studio) and provides detailed, step-by-step instructions for these, while omitting equivalent Linux-native workflows (e.g., using Mono on Linux, building with dotnet CLI or msbuild, uploading via Azure CLI, or running jobs via SSH). Although the document states that HDInsight clusters are Linux-based and briefly mentions SSH for running Pig jobs, it does not provide Linux-based development or deployment examples, nor does it mention cross-platform alternatives in the main workflow.
Recommendations:
- Provide parallel instructions for Linux users, including how to build C# projects on Linux with the cross-platform dotnet CLI, or with msbuild under Mono.
- Include examples of uploading files to HDInsight using the Azure CLI or AzCopy, not just Visual Studio/Data Lake Tools.
- Demonstrate how to run Hive queries using Azure CLI, Beeline, or SSH, instead of only through Visual Studio.
- Mention and show how to use cross-platform editors like VS Code or JetBrains Rider, and clarify which steps are Windows-specific.
- Explicitly state that Visual Studio steps are optional and provide equivalent command-line or Linux-native alternatives.
- Add a section or callouts for Mac/Linux users, ensuring parity in development, deployment, and job submission workflows.
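The recommendations above can be sketched as a minimal Linux-native workflow. This is a hedged illustration, not taken from the original documentation: the cluster name `mycluster`, SSH user `sshuser`, storage account `mystorage`, container `mycontainer`, and file paths are all hypothetical placeholders.

```shell
# Build the C# UDF with the cross-platform dotnet CLI (no Visual Studio needed)
dotnet build HiveCSharp.csproj --configuration Release

# Upload the compiled binary to the cluster's default storage with the Azure CLI
# (account, container, and paths are placeholders)
az storage blob upload \
    --account-name mystorage \
    --container-name mycontainer \
    --name HiveCSharp.exe \
    --file bin/Release/HiveCSharp.exe

# Submit a Hive query over SSH with Beeline instead of Visual Studio
ssh sshuser@mycluster-ssh.azurehdinsight.net \
    'beeline -u "jdbc:hive2://headnodehost:10001/;transportMode=http" -f query.hql'
```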
Flagged Code Snippets
-- Uncomment the following if you are using Azure Storage
-- add file wasbs:///HiveCSharp.exe;
-- Uncomment the following if you are using Azure Data Lake Storage Gen1
-- add file adl:///HiveCSharp.exe;
-- Uncomment the following if you are using Azure Data Lake Storage Gen2
-- add file abfs:///HiveCSharp.exe;
-- Stream each row through the C# UDF to produce the transformed columns
SELECT TRANSFORM (clientid, devicemake, devicemodel)
USING 'HiveCSharp.exe' AS
(clientid string, phoneLabel string, phoneHash string)
FROM hivesampletable
ORDER BY clientid LIMIT 50;
-- Define a streaming command that pipes each record through the C# UDF,
-- caching the executable from default storage onto each worker node
DEFINE streamer `PigUDF.exe` CACHE('/PigUDF.exe');
-- Load raw log lines and drop empty records before streaming
LOGS = LOAD '/example/data/sample.log' AS (LINE:chararray);
LOG = FILTER LOGS BY LINE IS NOT NULL;
DETAILS = STREAM LOG THROUGH streamer AS (col1, col2, col3, col4, col5);
DUMP DETAILS;
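Since the analyzed documentation already mentions SSH for running Pig jobs, the flagged Pig script above could be executed from a Linux shell roughly as follows. The script filename `streamer.pig`, user `sshuser`, and cluster name `mycluster` are hypothetical placeholders.

```shell
# Copy the Pig script to the cluster head node, then run it over SSH
scp streamer.pig sshuser@mycluster-ssh.azurehdinsight.net:
ssh sshuser@mycluster-ssh.azurehdinsight.net 'pig streamer.pig'
```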