Remove NFS mounted partitions from Icinga2 output?

We have about 10 servers that all mount the same NFS partition. Every host in Icinga displays that NFS partition, so when the partition's threshold is reached, 10 mail notifications are sent for the same error.
The question is: how can I remove the NFS partition from the different hosts?
For now the default config is as below:
apply Service for (display_name => config in host.vars.snmp.disks) {
  import "generic-service-faxir"

  check_command = "snmp-storage-parameteric"
  vars += config

  if (vars.snmp_warn == "") {
    vars.snmp_warn = "70"
  }
  if (vars.snmp_crit == "") {
    vars.snmp_crit = "85"
  }

  // Convert capacity to percentage
  if (vars.capacity != "") {
    if (vars.capacity_warn != "") {
      vars.snmp_warn = 100 * vars.capacity_warn / vars.capacity
    }
    if (vars.capacity_crit != "") {
      vars.snmp_crit = 100 * vars.capacity_crit / vars.capacity
    }
  }

  // ext2, ext3, and ext4 have 5% reserved for the OS
  if (host.vars.os == "Linux") {
    vars.snmp_storage_reserved = 5
  }

  ignore where host.vars.os !in ["Linux", "Windows"]
}
EDIT1:
The command code is as below:
/**
 * based on:
 * snmp storage - Disk/Memory
 * Url reference: http://nagios.manubulon.com/snmp_storage.html
 */
object CheckCommand "snmp-storage-parameteric" {
  import "snmp-manubulon-command"

  command = [ ManubulonPluginDir + "/check_snmp_storage.pl" ]

  arguments += {
    "-m" = "$snmp_storage_name$"
    "-f" = {
      set_if = "$snmp_perf$"
    }
    "-R" = "$snmp_storage_reserved$"
    "-T" = "$snmp_storage_type$"
    "-G" = ""
  }

  vars.snmp_storage_name = "^/$$"
  vars.snmp_storage_type = "pu"
  vars.snmp_warn = 80
  vars.snmp_crit = 90
  vars.snmp_perf = true
  vars.snmp_storage_reserved = 0
}

I haven't tried it, but you could look into the following command parameters:
match a name pattern - https://github.com/dnsmichi/manubulon-snmp/blob/master/plugins/check_snmp_storage.pl#L164
-m, --name=NAME
    Name in description OID (can be mountpoints '/home' or 'Swap Space'...)
    This is treated as a regexp: -m /var will match /var, /var/log, /opt/var ...
    Test it before, because there are known bugs (e.g. trailing /)
    No trailing slash for mountpoints!
exclude specific volumes - https://github.com/dnsmichi/manubulon-snmp/blob/master/plugins/check_snmp_storage.pl#L180
-e, --exclude
    Select all storages except the one(s) selected by -m
    No action on storage type selection
select a storage type - https://github.com/dnsmichi/manubulon-snmp/blob/master/plugins/check_snmp_storage.pl#L169
"-q, --storagetype=[Other|Ram|VirtualMemory|FixedDisk|RemovableDisk|FloppyDisk
CompactDisk|RamDisk|FlashMemory|NetworkDisk]
Best is to test the various parameters on the command line and then add them to your CheckCommand and Service definition.
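If you would rather solve it purely on the Icinga side, here is an untested sketch: since each disk entry in host.vars.snmp.disks is a dictionary (that is what vars += config already assumes), you could flag the NFS mount in the host definitions and skip it in the apply rule, keeping the check on one designated host simply by not setting the flag there. The "/mnt/nfs_share" key is just a placeholder for whatever entry you use today.
// In each host definition, add a flag to the existing entry for the NFS mount
// ("/mnt/nfs_share" is a placeholder key):
vars.snmp.disks["/mnt/nfs_share"] = {
  nfs = true
}

// In the apply rule above, next to the existing ignore:
ignore where config.nfs == true
Alternatively, the plugin's -e/--exclude flag could invert the -m match at the check level, but the CheckCommand above doesn't expose it, so you would have to add an extra argument for that.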

Related

Terraform - ordered generation of resources which are related based on a list variable

I am currently trying to automate nested SumoLogic folder creation as part of my custom module. I have to use this resource. I need to create a folder path similar to:
parent_folder_path = "SRE/Test/Troubleshooting"
and because this variable will change between environments, I cannot hardcode the creation of the underlying resources. The problematic part is that all the listed folders (SRE, Test, Troubleshooting) need to be created in sequence, because each one needs the id of the previous one (e.g. the Test folder needs the id of the already created SRE folder).
The end result I am aiming at is automatically generated code like this:
resource "sumologic_folder" "SRE" {
provider = sumologic
name = "SRE"
description = ""
parent_id = "0000000000XXXXX"
}
resource "sumologic_folder" "Test" {
provider = sumologic
name = "Test"
description = ""
parent_id = sumologic_folder.SRE.id
}
resource "sumologic_folder" "Troubleshooting" {
provider = sumologic
name = "Troubleshooting"
description = ""
parent_id = sumologic_folder.Test.id
}
I tried an approach which uses templatefile() and local_file:
parent_directories.tftpl
%{~ for index, path_part in parent_folder_path ~}
%{~ if index == 0 ~}
resource "sumologic_folder" "${replace(path_part, " ", "_")}" {
  provider    = sumologic
  name        = "${path_part}"
  description = ""
  parent_id   = "${root_folder_id}"
}
%{~ else }
resource "sumologic_folder" "${replace(path_part, " ", "_")}" {
  provider    = sumologic
  name        = "${path_part}"
  description = ""
  parent_id   = sumologic_folder.${replace(parent_folder_path[index - 1], " ", "_")}.id
}
%{~ endif ~}
%{~ endfor ~}
main.tf
resource "local_file" "parent_directories" {
content = templatefile("${path.module}/parent_directories.tftpl", { parent_folder_path = split("/", var.parent_folder_path), root_folder_id = var.root_folder_id })
filename = "${path.module}/parent_directories.tf"
}
and the file was correctly generated during the terraform apply run, but I was not able to include it in the scope of the same run dynamically.
Does anyone know how to handle such a use case?
Thanks in advance for any help.
Best Regards,
Rafal.
I understand what you are trying to achieve: you want to create multiple resources of the same type, each depending on the one created before it (the previous item on the list), while not knowing in advance how many there will be (how many folders are in the path). I am afraid that is not how Terraform works; you would create a cycle between the elements of a list or map of the same resource.
That said, I can offer you an ugly solution. If you can limit yourself to some maximum number of subdirectories, say up to five or ten levels, you can write code that creates three folders if there are three dirs in the path, four if there are four, and so on; you simply stop creating resources once a level is empty.
Let's say you have a sumo module:
variable "parent_path" {}
variable "name" {}
data "sumologic_folder" "parent" {
path = var.parent_path
}
resource "sumologic_folder" "folder" {
provider = sumologic
name = var.name
description = ""
parent_id = data.sumologic_folder.parent.id
}
output "path" {
value = "${var.path}/${var.name}"
}
And then you can split the path into a list of folders and create as many module instances as there are folders in the path, for example: AA/BB/CC/DD = 4 sumologic folders.
locals {
  desired_path = "SRE/Test/Troubleshooting" # example - 3 folders
  regex        = regexall("[^//]+", local.desired_path)
  path0        = "/"
}

module "sumo" {
  source      = "./sumo"
  name        = local.regex[0]
  parent_path = local.path0 # var.parent_path
}

module "sumo_child_1" {
  source      = "./sumo"
  count       = try(local.regex[1], null) == null ? 0 : 1
  name        = try(local.regex[1], "none")
  parent_path = module.sumo.path
}

module "sumo_child_2" {
  source      = "./sumo"
  count       = try(local.regex[2], null) == null ? 0 : 1
  name        = try(local.regex[2], "none")
  parent_path = module.sumo_child_1.path
}

module "sumo_child_3" { # this is NOT going to be even created in our example
  source      = "./sumo"
  count       = try(local.regex[3], null) == null ? 0 : 1
  name        = try(local.regex[3], "none")
  parent_path = module.sumo_child_2.path
}

# and so on... if there are no more folders in the path, the resources won't be created anyway.
Now let me say that again, this is a very ugly solution... but it works. Cheers.

Terraform metadata condition

I have two environments. I'm trying to write a condition in the ssh metadata block to add an ssh key depending on the environment.
For example: if env-1, add the ssh-1 key; if env-2, add the ssh-2 key. I am trying with a map, but cannot get it to work correctly. How can I do this better?
metadata = {
  count              = var.ENV_TYPE != "ENV-1" ? 1 : 0
  ssh-keys           = "centos:ssh-rsa AAAAsfdsds..."
  instance_role      = var.GCP_CUSTOM_METADATA
  app_env_monitoring = var.GCP_CUSTOM_METADATA_MONITORING
}

metadata = {
  count              = var.ENV_TYPE = "ENV-1" ? 1 : 0
  ssh-keys           = "centos:ssh-rsa BBBBsfdsds..."
  instance_role      = var.GCP_CUSTOM_METADATA
  app_env_monitoring = var.GCP_CUSTOM_METADATA_MONITORING
}
You can probably achieve what you want by using a ternary operator:
metadata = {
  ssh-keys           = var.ENV_TYPE == "ENV-1" ? "centos:ssh-rsa BBBBsfdsds..." : "centos:ssh-rsa AAAAsfdsds..."
  instance_role      = var.GCP_CUSTOM_METADATA
  app_env_monitoring = var.GCP_CUSTOM_METADATA_MONITORING
}
Additionally, I would strongly suggest moving the SSH key at least into a variable. In that case, the above code would look cleaner:
metadata = {
  ssh-keys           = var.ENV_TYPE == "ENV-1" ? var.ssh_key_env1 : var.ssh_key_env2
  instance_role      = var.GCP_CUSTOM_METADATA
  app_env_monitoring = var.GCP_CUSTOM_METADATA_MONITORING
}
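For completeness, a rough sketch of how those two variables could be declared (the names ssh_key_env1/ssh_key_env2 are just the ones assumed above, and the defaults reuse the placeholder keys from the question):
variable "ssh_key_env1" {
  type        = string
  description = "SSH key added on ENV-1 instances"
  default     = "centos:ssh-rsa BBBBsfdsds..."
}

variable "ssh_key_env2" {
  type        = string
  description = "SSH key added in every other environment"
  default     = "centos:ssh-rsa AAAAsfdsds..."
}
In practice you would probably set these per environment via tfvars files rather than defaults.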

Terraform: Specify specific Docker Network Name for Output

I have a Containerized Network Function (CNF) that connects to three Docker Networks:
...
ip_address = "172.17.0.3"
ip_prefix_length = 16
ipc_mode = "private"
log_driver = "json-file"
log_opts = {}
logs = false
max_retry_count = 0
memory = 4096
memory_swap = -1
must_run = true
name = "c-router-52"
network_data = [
{
gateway = "172.17.0.1"
ip_address = "172.17.0.3"
ip_prefix_length = 16
network_name = "bridge"
},
{
gateway = "172.31.0.1"
ip_address = "172.31.0.4"
ip_prefix_length = 16
network_name = "inside-net"
},
{
gateway = "172.30.0.1"
ip_address = "172.30.0.3"
ip_prefix_length = 16
network_name = "outside-net"
},
]
network_mode = "default"
...
And I am trying to grab the 'outside-net' IP address for use as an input for another container. I am specifying like so:
${docker_container.c-router-52.network_data[2].ip_address}
When it's the third element, it works fine... but the problem is that Terraform (or Docker, one of the two) doesn't always put the 'outside-net' as the third network :(
Is there a way to select by network_name = "outside-net" rather than by index number?
Since your code example isn't complete I'm having to guess a little here, but it seems like what you want is a mapping from network name to IP address. You can derive such a data structure from your resource configuration using a for expression, which you can assign to a local value for use elsewhere in the configuration:
locals {
  container_ip_addresses = {
    for net in docker_container.c-router-52.network_data :
    net.network_name => net.ip_address
  }
}
With the above definition in your module, you can refer to local.container_ip_addresses elsewhere in your module to refer to this mapping, such as local.container_ip_addresses["outside-net"] to access the outside-net address in particular.
With the network_data structure you showed in your configuration, local.container_ip_addresses would have the following value:
{
  bridge      = "172.17.0.3"
  inside-net  = "172.31.0.4"
  outside-net = "172.30.0.3"
}
If you need to access the other attributes of those network_data objects, rather than just the ip_address, you can generalize this by making the values of the mapping be the full network objects:
locals {
  container_networks = {
    for net in docker_container.c-router-52.network_data :
    net.network_name => net
  }
}
...which would then allow you to access all of the attributes via the network name keys:
local.container_networks["outside-net"].ip_address
local.container_networks["outside-net"].gateway
local.container_networks["outside-net"].ip_prefix_length

Spark lists all leaf nodes even in partitioned data

I have parquet data partitioned by date & hour, folder structure:
events_v3
  -- event_date=2015-01-01
    -- event_hour=2015-01-1
      -- part10000.parquet.gz
  -- event_date=2015-01-02
    -- event_hour=5
      -- part10000.parquet.gz
I have created a table raw_events via Spark, but when I try to query it, it scans all the directories for footers and that slows down the initial query, even if I am querying only one day's worth of data.
query:
select * from raw_events where event_date='2016-01-01'
similar problem: http://mail-archives.apache.org/mod_mbox/spark-user/201508.mbox/%3CCAAswR-7Qbd2tdLSsO76zyw9tvs-Njw2YVd36bRfCG3DKZrH0tw#mail.gmail.com%3E (but it's old)
Log:
App > 16/09/15 03:14:03 main INFO HadoopFsRelation: Listing leaf files and directories in parallel under: s3a://bucket/events_v3/
and then it spawns 350 tasks, since there are 350 days' worth of data.
I have disabled schemaMerge and have also specified the schema to read, so it should be able to go straight to the partition I am looking at; why does it list all the leaf files?
Listing leaf files with 2 executors takes 10 minutes, while the actual query execution takes 20 seconds.
code sample:
val sparkSession = org.apache.spark.sql.SparkSession.builder.getOrCreate()
val df = sparkSession.read.option("mergeSchema","false").format("parquet").load("s3a://bucket/events_v3")
df.createOrReplaceTempView("temp_events")
sparkSession.sql(
"""
|select verb,count(*) from temp_events where event_date = "2016-01-01" group by verb
""".stripMargin).show()
As soon as Spark is given a directory to read from, it issues a call to listLeafFiles (org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala). This in turn calls fs.listStatus, which makes an API call to get the list of files and directories. For each directory this method is then called again, recursively, until no directories are left. By design this works well on HDFS, but it works badly on S3, since each file listing is an RPC call. S3, on the other hand, supports getting all files by prefix, which is exactly what we need.
So, for example, if we had the above directory structure with one year's worth of data, a directory per hour, and 10 subdirectories in each, we would have 365 * 24 * 10 = 87k API calls. This can be reduced to 138 API calls, given that there are only 137,000 files: each S3 list call returns up to 1,000 files.
Code:
org/apache/hadoop/fs/s3a/S3AFileSystem.java
public FileStatus[] listStatusRecursively(Path f) throws FileNotFoundException,
    IOException {
  String key = pathToKey(f);
  if (LOG.isDebugEnabled()) {
    LOG.debug("List status for path: " + f);
  }

  final List<FileStatus> result = new ArrayList<FileStatus>();
  final FileStatus fileStatus = getFileStatus(f);

  if (fileStatus.isDirectory()) {
    if (!key.isEmpty()) {
      key = key + "/";
    }

    ListObjectsRequest request = new ListObjectsRequest();
    request.setBucketName(bucket);
    request.setPrefix(key);
    request.setMaxKeys(maxKeys);

    if (LOG.isDebugEnabled()) {
      LOG.debug("listStatus: doing listObjects for directory " + key);
    }

    ObjectListing objects = s3.listObjects(request);
    statistics.incrementReadOps(1);

    while (true) {
      for (S3ObjectSummary summary : objects.getObjectSummaries()) {
        Path keyPath = keyToPath(summary.getKey()).makeQualified(uri, workingDir);
        // Skip over keys that are ourselves and old S3N _$folder$ files
        if (keyPath.equals(f) || summary.getKey().endsWith(S3N_FOLDER_SUFFIX)) {
          if (LOG.isDebugEnabled()) {
            LOG.debug("Ignoring: " + keyPath);
          }
          continue;
        }

        if (objectRepresentsDirectory(summary.getKey(), summary.getSize())) {
          result.add(new S3AFileStatus(true, true, keyPath));
          if (LOG.isDebugEnabled()) {
            LOG.debug("Adding: fd: " + keyPath);
          }
        } else {
          result.add(new S3AFileStatus(summary.getSize(),
              dateToLong(summary.getLastModified()), keyPath,
              getDefaultBlockSize(f.makeQualified(uri, workingDir))));
          if (LOG.isDebugEnabled()) {
            LOG.debug("Adding: fi: " + keyPath);
          }
        }
      }

      for (String prefix : objects.getCommonPrefixes()) {
        Path keyPath = keyToPath(prefix).makeQualified(uri, workingDir);
        if (keyPath.equals(f)) {
          continue;
        }
        result.add(new S3AFileStatus(true, false, keyPath));
        if (LOG.isDebugEnabled()) {
          LOG.debug("Adding: rd: " + keyPath);
        }
      }

      if (objects.isTruncated()) {
        if (LOG.isDebugEnabled()) {
          LOG.debug("listStatus: list truncated - getting next batch");
        }
        objects = s3.listNextBatchOfObjects(objects);
        statistics.incrementReadOps(1);
      } else {
        break;
      }
    }
  } else {
    if (LOG.isDebugEnabled()) {
      LOG.debug("Adding: rd (not a dir): " + f);
    }
    result.add(fileStatus);
  }

  return result.toArray(new FileStatus[result.size()]);
}
org/apache/spark/sql/execution/datasources/fileSourceInterfaces.scala
def listLeafFiles(fs: FileSystem, status: FileStatus, filter: PathFilter): Array[FileStatus] = {
  logTrace(s"Listing ${status.getPath}")
  val name = status.getPath.getName.toLowerCase
  if (shouldFilterOut(name)) {
    Array.empty[FileStatus]
  } else {
    val statuses = {
      val stats = if (fs.isInstanceOf[S3AFileSystem]) {
        logWarning("Using Monkey patched version of list status")
        println("Using Monkey patched version of list status")
        val a = fs.asInstanceOf[S3AFileSystem].listStatusRecursively(status.getPath)
        a
        // Array.empty[FileStatus]
      } else {
        val (dirs, files) = fs.listStatus(status.getPath).partition(_.isDirectory)
        files ++ dirs.flatMap(dir => listLeafFiles(fs, dir, filter))
      }
      if (filter != null) stats.filter(f => filter.accept(f.getPath)) else stats
    }
    // statuses do not have any dirs.
    statuses.filterNot(status => shouldFilterOut(status.getPath.getName)).map {
      case f: LocatedFileStatus => f
      // NOTE:
      //
      // - Although S3/S3A/S3N file system can be quite slow for remote file metadata
      //   operations, calling `getFileBlockLocations` does no harm here since these file system
      //   implementations don't actually issue RPC for this method.
      //
      // - Here we are calling `getFileBlockLocations` in a sequential manner, but it should not
      //   be a big deal since we always use to `listLeafFilesInParallel` when the number of
      //   paths exceeds threshold.
      case f => createLocatedFileStatus(f, fs.getFileBlockLocations(f, 0, f.getLen))
    }
  }
}
To clarify Gaurav's answer, that code snippet is from Hadoop branch-2, so it is probably not going to surface until Hadoop 2.9 (see HADOOP-13208); and someone needs to update Spark to use that feature (which won't harm code using HDFS, it just won't show any speedup there).
One thing to consider is: what makes a good file layout for Object Stores.
Don't have deep directory trees with only a few files per directory
Do have shallow trees with many files
Consider using the first few characters of a file for the most changing value (such as day/hour), rather than the last. Why? Some object stores appear to use the leading characters for their hashing, not the trailing ones ... if you give your names more uniqueness then they get spread out over more servers, with better bandwidth/less risk of throttling.
If you are using the Hadoop 2.7 libraries, switch to s3a:// over s3n://. It's already faster, and getting better every week, at least in the ASF source tree.
Finally, Apache Hadoop, Apache Spark and related projects are all open source. Contributions are welcome. That's not just the code, it's documentation, testing, and, for this performance stuff, testing against your actual datasets. Even giving us details about what causes problems (and your dataset layouts) is interesting.

Why is code running on Azure so slow?

I have a web app running on Azure in shared web site mode. A simple method where I add items to a list and sort that list takes 0.3 s on my machine and 10 s after deployment (on the Azure machine) when the list has about 300 items.
Does anybody have any idea why Azure is so slow?
Is there some configuration I am doing wrong? I use the default one, but replaced FREE mode with SHARED mode because I thought this would help; it seems it does not.
UPDATE:
public ActionResult GetPosts(String selectedStreams, int implicitSelectedVisualiserId, int userId)
{
    DateTime begin = DateTime.UtcNow;
    List<SearchQuery> selectedSearchQueries = searchQueryRepository.GetSearchQueriesOfStreamsIds(selectedStreams == String.Empty ? new List<int>() : selectedStreams.Split(',').Select(n => int.Parse(n)).ToList());
    var implicitSelectedVisualiser = VisualiserModel.ToVisualiserModel(visualiserRepository.GetVisualiser(implicitSelectedVisualiserId));
    var twitterSearchQueryOfImplicitSelectedVisualiser = searchQueryRepository.GetSearchQuery(implicitSelectedVisualiser.Stream.Name, Service.Twitter, userId);
    var instagramSearchQueryOfImplicitSelectedVisualiser = searchQueryRepository.GetSearchQuery(implicitSelectedVisualiser.Stream.Name, Service.Instagram, userId);
    var facebookSearchQueryOfImplicitSelectedVisualiser = searchQueryRepository.GetSearchQuery(implicitSelectedVisualiser.Stream.Name, Service.Facebook, userId);
    var manualSearchQueryOfImplicitSelectedVisualiser = searchQueryRepository.GetSearchQuery(implicitSelectedVisualiser.Stream.Name, Service.Manual, userId);

    List<SearchResultModel> approvedSearchResults = new List<SearchResultModel>();

    if (twitterSearchQueryOfImplicitSelectedVisualiser != null || instagramSearchQueryOfImplicitSelectedVisualiser != null || facebookSearchQueryOfImplicitSelectedVisualiser != null
        || manualSearchQueryOfImplicitSelectedVisualiser != null)
    {
        // Define search text to be displayed during slideshow;
        SearchModel searchModel = new SearchModel();
        // Set slideshow settings from implicit selected visualiser.
        ViewBag.CurrentVisualiser = implicitSelectedVisualiser;

        // Load search results from selected visualisers.
        foreach (SearchQuery searchQuery in selectedSearchQueries)
        {
            approvedSearchResults.AddRange(
                SearchResultModel.ToSearchResultModel(
                    searchResultRepository.GetSearchResults
                        (searchQuery.Id,
                         implicitSelectedVisualiser.Language)));
            // Add defined query too.
            searchModel.SearchValue += " " + searchQuery.Query;
        }

        // Add defined query for implicit selected visualiser.
        if (twitterSearchQueryOfImplicitSelectedVisualiser != null)
            searchModel.SearchValue += " " + twitterSearchQueryOfImplicitSelectedVisualiser.Query;
        if (instagramSearchQueryOfImplicitSelectedVisualiser != null)
            searchModel.SearchValue += " " + instagramSearchQueryOfImplicitSelectedVisualiser.Query;
        if (facebookSearchQueryOfImplicitSelectedVisualiser != null)
            searchModel.SearchValue += " " + facebookSearchQueryOfImplicitSelectedVisualiser.Query;

        ViewBag.Search = searchModel;

        // Also add search results from implicit selected visualiser
        if (twitterSearchQueryOfImplicitSelectedVisualiser != null)
            approvedSearchResults.AddRange(SearchResultModel.ToSearchResultModel(searchResultRepository.GetSearchResults(twitterSearchQueryOfImplicitSelectedVisualiser.Id, implicitSelectedVisualiser.Language)));
        if (instagramSearchQueryOfImplicitSelectedVisualiser != null)
            approvedSearchResults.AddRange(SearchResultModel.ToSearchResultModel(searchResultRepository.GetSearchResults(instagramSearchQueryOfImplicitSelectedVisualiser.Id, implicitSelectedVisualiser.Language)));
        if (facebookSearchQueryOfImplicitSelectedVisualiser != null)
            approvedSearchResults.AddRange(SearchResultModel.ToSearchResultModel(searchResultRepository.GetSearchResults(facebookSearchQueryOfImplicitSelectedVisualiser.Id, implicitSelectedVisualiser.Language)));
        if (manualSearchQueryOfImplicitSelectedVisualiser != null)
            approvedSearchResults.AddRange(SearchResultModel.ToSearchResultModel(searchResultRepository.GetSearchResults(manualSearchQueryOfImplicitSelectedVisualiser.Id, implicitSelectedVisualiser.Language)));

        // if user selected to show only posts from specific number of last days.
        var approvedSearchResultsFilteredByDays = new List<SearchResultModel>();
        if (implicitSelectedVisualiser.ShowPostsFromLastXDays != 0)
        {
            foreach (SearchResultModel searchResult in approvedSearchResults)
            {
                var postCreatedTimeWithDays = searchResult.PostCreatedTime.AddDays(implicitSelectedVisualiser.ShowPostsFromLastXDays + 1);
                if (postCreatedTimeWithDays >= DateTime.Now)
                    approvedSearchResultsFilteredByDays.Add(searchResult);
            }
        }
        else
        {
            approvedSearchResultsFilteredByDays = approvedSearchResults;
        }

        // Order search results (posts to be displayed by created datetime).
        var approvedSearchResultsOrdered = new List<SearchResultModel>();
        if (implicitSelectedVisualiser.PostsSortOrder == PostsSortOrder.CREATED_DATE_ASC)
        {
            approvedSearchResultsOrdered = approvedSearchResultsFilteredByDays.OrderBy(s => s.PostCreatedTime).ToList();
        }
        else if (implicitSelectedVisualiser.PostsSortOrder == PostsSortOrder.CREATED_DATE_DESC)
        {
            approvedSearchResultsOrdered = approvedSearchResultsFilteredByDays.OrderByDescending(s => s.PostCreatedTime).ToList();
        }
        else if (implicitSelectedVisualiser.PostsSortOrder == PostsSortOrder.RANDOM)
        {
            var rnd = new Random();
            approvedSearchResultsOrdered = approvedSearchResultsFilteredByDays.OrderBy(x => rnd.Next()).ToList();
        }

        // Load background images;
        var visualiserImages = visualiserImageRepository.GetImages(implicitSelectedVisualiser.Id);
        //foreach (SearchResultModel searchResultModel in approvedSearchResultsOrdered)
        //{
        //    searchResultModel.BackgroundImagePath = TwitterUtils.GetRandomImageBackgroundForDisplay(visualiserImages);
        //}
        ViewBag.BackgroundImagePath = TwitterUtils.GetRandomImageBackgroundForDisplay(visualiserImages);

        approvedSearchResults = approvedSearchResultsOrdered;
    }

    DateTime end = DateTime.UtcNow;
    Elmah.ErrorSignal.FromCurrentContext().Raise(new Exception(String.Format("User {0}: Preparing {1} posts for visualiser took {2} seconds", MySession.Current.LoggedInUserName, approvedSearchResults.Count(), (end - begin).TotalMilliseconds / 1000)));
    return PartialView("_DisplayPostsNew", approvedSearchResults);
}
This isn't surprising, actually. The servers used in Windows Azure are currently mostly 1.6 GHz machines. The larger the machine size you use, the more cores you get, but they are all the same speed. This is likely a much slower CPU than the development machine you use.
On Windows Azure Web Sites when you move to Shared mode you are still in a multi-tenant environment, so you could be seeing some noisy neighbors here. The difference between Free and Shared is that many of the quotas for free are removed since you are paying. When you move to Standard then you are assigned a Virtual Machine dedicated to your web sites (up to 100 of them), so that is the best case scenario since you are the only one using the resources at that point.
There was a thread on this on the MSDN forums a while back : http://social.msdn.microsoft.com/Forums/windowsazure/en-US/0d0a3a88-eac4-4b9e-8b10-4a547cbf653b/performance-of-azure-servers-slow-cpus?forum=windowsazuredevelopment
They have started offering different hardware configurations with more memory for Virtual Machines and Cloud Services and such, but I'm not sure the CPUs have been changed. It's hard to find the CPU stated on WindowsAzure.com anymore, but on the pricing calculator for Web Sites it references 1.6Ghz machines when you move the slider to Standard.
Actually, I found the issue.
Locally I tested with a few hundred records in my DB, while in the Azure DB I have over 70,000 records in that table, which affects the performance of the algorithm...
One mistake I made in the code above: I filtered the records from the DB by date AFTER pulling them all out. By filtering directly in the LINQ query (so the filter runs in the database), I increased the performance from 10 s to 0.3 s on Azure too.
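Roughly, the change looks like this (a hedged sketch with hypothetical names; the real code uses the repository classes from the snippet above). The point is to apply the date filter on the IQueryable so the ORM translates it to SQL, instead of loading the whole table and filtering the resulting list in memory:
using System;
using System.Linq;

public class SearchResult
{
    public int SearchQueryId { get; set; }
    public DateTime PostCreatedTime { get; set; }
}

public static class SearchResultQueries
{
    // 'results' is assumed to be an IQueryable<SearchResult> backed by the database
    // (e.g. an ORM query). The Where clause is then executed as SQL, so only the
    // matching rows are materialized instead of all 70,000.
    public static IQueryable<SearchResult> GetRecentResults(
        IQueryable<SearchResult> results, int searchQueryId, int lastDays)
    {
        DateTime cutoff = DateTime.UtcNow.AddDays(-lastDays);
        return results.Where(r => r.SearchQueryId == searchQueryId
                                  && r.PostCreatedTime >= cutoff);
    }
}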
