Question 2011 Scripting Games : Advanced Event 6

14 years 9 months ago #9699 by Matthew BETTON
Good evening,

Below is the script I submitted during the 2011 Scripting Games for Advanced Event 6.

The scenario was as follows:

You are getting ready to attend a SQL Saturday event in your area. Prior to the event, you want to ensure that you know who is attending the event so you can leverage your networking opportunities. You discover the networking page and see that it has everyone’s Twitter username listed. After a bit of thought, you decide that it would be cool to have a list of everyone’s Twitter usernames in a text file. To do this, you need to write a script to obtain only the Twitter usernames from the networking site, and write the names to a text file. A sample networking page from one of these events is shown in the following image.


My script (one possible solution):

[code:1]#
# 2011 Scripting Games : Advanced Event 6
# Script : Get-TwitterNamesFromWebPage.ps1
# Author : Matthew BETTON (France / Basse-Normandie / Manche (50))
# Date : 04/12/2011
# Synopsis : Use PowerShell to Get Twitter Names from a Web Page
#

<#
.SYNOPSIS
This script gets Twitter Names from a Web Page.
.DESCRIPTION
The 'Get-TwitterNamesFromWebPage.ps1' script finds all Twitter names in the Web page found at the specified URI.

This script uses cmdlet binding, so it automatically gets the -Debug switch parameter, which enables Write-Debug locally.
Alternatively, if you set the $DebugPreference variable to "Continue" before running this script, debugging information is written to the console.

.PARAMETER URI
You must specify this parameter (String).
The URI (Uniform Resource Identifier) should be an http address like "http://www.sqlsaturday.com/70/networking.aspx"

.PARAMETER Filepath
You must specify this parameter (String).
Specifies the path to the output text file.

.PARAMETER OpenTextFile
If this switch parameter is used, the text file is automatically opened in the program associated with text files.

.OUTPUTS
This script returns System.String

.EXAMPLE
C:\PS>.\Get-TwitterNamesFromWebPage -URI "http://www.sqlsaturday.com/70/networking.aspx" -FilePath ".\twitter.txt"

This command gets Twitter names from the HTTP web page found at the address "http://www.sqlsaturday.com/70/networking.aspx".
It returns the names to the console and creates the '.\twitter.txt' text file with the Twitter names.

.EXAMPLE
C:\PS>.\Get-TwitterNamesFromWebPage -URI "http://www.sqlsaturday.com/70/networking.aspx" -FilePath ".\twitter.txt" -OpenTextFile

This command gets Twitter names from the HTTP web page found at the address "http://www.sqlsaturday.com/70/networking.aspx".
It returns the names to the console and creates the '.\twitter.txt' text file with the Twitter names.
Then the text file is opened in the program associated with text files (typically Notepad).

.EXAMPLE
C:\PS>$DebugPreference = "Continue"
C:\PS>.\Get-TwitterNamesFromWebPage -URI "http://www.sqlsaturday.com/70/networking.aspx" -FilePath ".\twitter.txt"

The first command sets the $DebugPreference variable to "Continue", so debugging information is written to the console.
The second command returns the Twitter names to the console and creates the '.\twitter.txt' text file with the Twitter names.

.EXAMPLE
C:\PS>$TwitterNames = .\Get-TwitterNamesFromWebPage -URI "http://www.sqlsaturday.com/70/networking.aspx" -FilePath ".\twitter.txt"
C:\PS>$TwitterNames.count
C:\PS>$TwitterNames | Where-Object {$_ -like "*sql*"}

The first command stores the Twitter names in a variable and creates the '.\twitter.txt' text file with the Twitter names.
The second command displays the number of Twitter names found in the Web page.
The last command finds all the Twitter names that contain 'sql'.

#>

[CmdletBinding()]
param(
    [Parameter(Mandatory=$true)]
    [String]$URI,
    [Parameter(Mandatory=$true)]
    [String]$FilePath,
    [Switch]$OpenTextFile = $False
)

> Function">
Write-Debug ">> Function 'Get-TwitterNamesFromWebPage' Called"
Write-Debug ">> URI : '$URI'"
Write-Debug ">> Filepath : '$FilePath'"
Write-Debug ">> OpenTextFile value : '$OpenTextFile'"

Write-Debug "Creating HTTP web request ..."
$req = [System.Net.HttpWebRequest]::Create($URI)

Write-Debug "Trying to get response from '$URI' ..."
try {
    $res = $req.GetResponse()
}
catch {
    Write-Host "An error has occurred while getting response from '$URI' : $($_.Exception.Message)" -ForegroundColor Red
    return $null
}

Write-Debug "Status Code : $($res.StatusCode)"
if ($res.StatusCode -eq 200) {
    [int]$goal = $res.ContentLength
    Write-Debug "Web response content length : $goal"
    Write-Debug "Getting response stream ..."
    $reader = $res.GetResponseStream()

    Write-Debug "Preparing to read the response stream ..."
    $encoding = [System.Text.Encoding]::GetEncoding($res.CharacterSet)
    [string]$output = ""
    [byte[]]$buffer = New-Object byte[] 4096
    [int]$total = [int]$count = 0

    Write-Debug "Reading the stream until end of stream ..."
    do {
        $count = $reader.Read($buffer, 0, $buffer.Length)
        $output += $encoding.GetString($buffer, 0, $count)
        $total += $count
        Write-Progress "Downloading '$URI'" "Saving $total of $goal" -PercentComplete (($total / $goal) * 100)
    } while ($count -gt 0)

    $reader.Close()
}
else {
    Write-Host "Could not access '$URI' (Status Code : $($res.StatusCode))" -ForegroundColor Red
    return $null
}

$res.Close()

if ($output) {
    Write-Debug "Finding twitter names from HTTP Web page ..."
    # Regular expression to find twitter names from href twitter URI
    $pattern = [regex]"<\s*a\s*[^>]*?href\s*=\s*[`"'](\b[^>]*twitter.com/(.*?)`")"
    $names = $pattern.Matches($output)
}
else {
    Write-Host "No output from HTTP Web Page '$URI'" -ForegroundColor Yellow
}

Write-Debug "Creating the result array that will contain Twitter names ..."
$resultTab = $null
$resultTab = @()

foreach ($name in $names) {
    $resultTab += $name.Groups[2].value
}

Write-Debug "$($resultTab.count) Twitter names have been found"

if ($FilePath -ne [System.String]::Empty) {
    Write-Debug "Writing Twitter names to '$Filepath' text file ..."
    try {
        $resultTab | Out-File -FilePath $FilePath
        if ($OpenTextFile) {
            Write-Debug "Opening '$Filepath' text file ..."
            Invoke-Item -Path $FilePath
        }
    }
    catch {
        Write-Host "A problem has occurred while creating / opening text file '$Filepath'"
    }
}

Write-Debug "Return Twitter names"
return $resultTab[/code:1]


For anyone interested, here is the link to the expert's solution.

All comments are welcome!

;)

14 years 9 months ago #9700 by Matthew BETTON
For information, here are Tome Tanasovski's remarks about my script:

So, I'm a bit torn on this one because the reality is that you did not satisfy the requirements and we are only supposed to give you one star in that case. You are specifically told to use the url provided to receive the list. You could have easily handled this by specifying a default value for the parameter or adding one line of code that calls your script properly. That said, I can't bring myself to give you one star. I think Ed Wilson or the other judges would need to overwrite my blatant disregard for the rules, but I'm doing it with the thought that your examples have the answer in them.
The real problem with your script is that you choose such a low level way of handling the http request. To be fair, I did compare your method with the System.Net.Webclient method of downloading the data and your way is a bit slower (.460 seconds compared to .249 seconds). If it was faster I would say that I understand your choice of classes, but there is a better way to do this.
On a side, you have an interesting choice of regex for handling the sqlvariant issue. I hadn't thought of doing it that way.
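
The two points in this feedback (a default URL and a higher-level download class) could be sketched roughly as follows. This is only an illustration: the function names, the `http://` prefix on the default URL, and the simplified pattern are assumptions made for the example, not the expert's solution or the regex from the script above.

```powershell
# Hypothetical helper: returns the first path segment of every twitter.com link.
function Get-TwitterNameFromHtml {
    param([Parameter(Mandatory = $true)][string]$Html)

    # Simplified pattern (an assumption, not the regex from the original script).
    $pattern = [regex]'(?i)href\s*=\s*["''][^"''>]*twitter\.com/([^"''/?#]+)'
    foreach ($match in $pattern.Matches($Html)) {
        $match.Groups[1].Value
    }
}

# Hypothetical wrapper: the default value is the URL from the script's examples
# (the http:// scheme is assumed), so no argument is needed for the event page.
function Get-TwitterNameFromWebPage {
    param([string]$Uri = 'http://www.sqlsaturday.com/70/networking.aspx')

    $client = New-Object System.Net.WebClient
    try {
        # DownloadString fetches the whole page as text in one call.
        Get-TwitterNameFromHtml -Html $client.DownloadString($Uri)
    }
    finally {
        $client.Dispose()   # WebClient implements IDisposable
    }
}
```

Keeping the parsing in its own function means the pattern can be checked against a local HTML string without touching the network.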


... and I had replied with this:

Thanks Tome for your comment and your great explanation.
I understand completely and I take notes...
I thought about reusable script (regardless of the url) before answering to the scenario of this event.
Mea Culpa :)
I will be very vigilant about other posted solutions, in particular for handling http request.
I thank the scripting games because I learn a lot !

14 years 9 months ago #9810 by Laurent Dardenne
Hi Matthew,
two small remarks on form. For this declaration:
[code:1][Switch]$OpenTextFile = $False[/code:1]
The assignment to $False is redundant, because omitting the switch on the command line is equivalent.
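
A quick way to check this behaviour, using a throwaway function (the name `Test-OpenTextFile` is hypothetical and exists only for this illustration):

```powershell
# Hypothetical function, used only to show the default state of a switch parameter.
function Test-OpenTextFile {
    param([switch]$OpenTextFile)   # no "= $false" needed

    # A switch that was not passed on the command line converts to $false.
    [bool]$OpenTextFile
}

Test-OpenTextFile                   # switch omitted: False
Test-OpenTextFile -OpenTextFile     # switch present: True
```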

And for classes that implement the IDisposable interface, the usual practice is to put the "release" of resources in a Try/Catch/Finally block.

Which goes to show that PowerShell is not just scripting...

On a side, you have an interesting choice of regex for handling the sqlvariant issue. I hadn't thought of doing it that way.

Nice compliment :)

PowerShell Tutorials

14 years 9 months ago #9811 by Matthew BETTON
Hi Laurent,

Thanks for your comments.

OK for the switch...

For the IDisposable interface, you are indeed right. So it should look something like this:

[code:1]$req = [System.Net.HttpWebRequest]::Create($URI)
$res = $null

Write-Debug "Trying to get response from '$URI' ..."
try {
    $res = $req.GetResponse()
    # ... read the response stream here ...
}
catch {
    Write-Host "An error has occurred while getting response from '$URI' : $($_.Exception.Message)" -ForegroundColor Red
    return $null
}
finally {
    # HttpWebRequest has no Dispose() method; the response is the object to release.
    if ($null -ne $res) { $res.Close() }
}[/code:1]

The 'finally' block runs in every case, whether or not an error occurs. It is generally used to run code that releases resources, cleans up...

According to MSDN, the 'Dispose()' method

"Performs application-defined tasks associated with freeing, releasing, or resetting unmanaged resources."


More information on this subject here.

It is important to call this method. I remember running into problems with a PowerShell script that sent emails with attachments. I was getting errors (a 'lock' on the files, I believe) because I had forgotten this famous 'Dispose()'.
But you do not necessarily notice it (the code itself raises no obvious error), so you have to stay vigilant :)
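
The file-lock symptom described above can be reproduced in miniature with a plain FileStream. This is a minimal sketch, not the mail script in question: the function name `Write-TextAndRelease` and the temporary file are assumptions made for the illustration.

```powershell
# Hypothetical helper: writes text through a FileStream and always releases it.
function Write-TextAndRelease {
    param(
        [Parameter(Mandatory = $true)][string]$Path,
        [Parameter(Mandatory = $true)][string]$Text
    )

    $stream = $null
    try {
        # The open FileStream holds a handle on the file until it is disposed.
        $stream = [System.IO.File]::Open($Path, [System.IO.FileMode]::Create)
        $bytes = [System.Text.Encoding]::UTF8.GetBytes($Text)
        $stream.Write($bytes, 0, $bytes.Length)
    }
    finally {
        # Runs whether or not the try block threw; the null check covers a failed Open().
        if ($null -ne $stream) { $stream.Dispose() }
    }
}

$path = [System.IO.Path]::GetTempFileName()
Write-TextAndRelease -Path $path -Text 'twitter names'
Remove-Item -Path $path   # the handle was released in the finally block
```

Without the Dispose() call, the handle would stay open until the stream is garbage collected, and other operations on the file could fail in the meantime.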

Nice compliment :)

:blush:
Let's say I made up for it by thinking of solving the problem with a regex :P


Thanks again ;)

See you,

Matthew

Message edited by: Matthew BETTON, at: 16/06/11 20:53

14 years 9 months ago #9817 by Laurent Dardenne
Matthew BETTON wrote:

The 'finally' block runs in every case, whether or not an error occurs.

From your wording, the essential point to remember is: in every case.

PowerShell Tutorials
