If you aren’t familiar with Owncloud, it is a very cool open-source software package that runs on Linux with Apache (or Nginx) and provides “Dropbox-like” functionality that you can host yourself.

This is a big deal for the tech-savvy average Joe who is worried about keeping private data private (i.e. he doesn’t want all of his personal documents stored by Microsoft, Dropbox, Google, etc.) but still wants the “cloud-like” ability to securely access and sync files across multiple devices.

It is also a big deal for any enterprise that wants to use “cloud storage” but has to worry about all of the above due to data security requirements. It is self-hosted, so you know exactly where all of the data is and you have control over the security components protecting it. Citrix, Dropbox, and others have realized a growing need for this and have “enterprise” products that are in the same vein. They just cost a good bit of money, don’t always meet all of the stringent security requirements imposed on some types of data, and tend to be complex/cumbersome systems.

Owncloud also has an enterprise version of their software, which runs upwards of $10k/year. When I compared “enterprise” vs. “open-source”, the only value I could see in going enterprise was support, plus one additional module that does granular file-activity logging (i.e. user jdoe shared this file on this date). Obviously support is support; you aren’t going to get enterprise support without paying an enterprise price. Writing that off, that just leaves the enhanced logging.

I don’t have the requisite skill-set to build my own logging module. But Owncloud is ultimately just a web application running on Apache, so why not track it like we would any other web application? Namely, using a site analytics tool and the Apache access log.

Enter Piwik… Piwik is like a “self-hosted” Google Analytics that is more of a pain to set up and doesn’t have as many bells and whistles. But it is self-hosted, so your analytics data is private. It is also open-source, so it goes with our ongoing theme of “free”.

This is by no means going to be a comprehensive setup guide for Piwik and Owncloud. Rather, I am going to share some answers I came up with that, as far as I could tell, nobody else had covered.

Once you get your server initially configured, install Owncloud in a web folder and set up a virtual-host file for it. For example, here is a common place you might deploy it:

/var/www/owncloud

Now, when you go to install Piwik, if it is solely being installed to track Owncloud, then you should place it in a sub-folder off of your owncloud www directory. So you could unzip the Piwik application to somewhere like this:

/var/www/owncloud/analytics

This is going to make your life A LOT easier because Piwik is then served from the same origin as Owncloud, so you don’t have to worry quite as much about the “Content Security Policy” (CSP) that the Owncloud application uses. This is especially important for Owncloud version 8.0.1+. In the olden days there was an easy way to bypass the CSP; in newer versions there is not…

Okay, after you initially install Piwik, it is going to hand you some JavaScript and tell you to place it everywhere you want to monitor stuff in your web application. Just ignore all of that for now and let me tell you why… It has to do with the Content Security Policy, which won’t allow inline JavaScript in a page to be executed on the spot. So you can place your tracking code, but most (all?) browsers will refuse to execute it. So no tracking. To get around this, you have to put all of your script into a .js file and load that file, rather than embedding the script inline. Here is how I did it…

In your Piwik root directory, create a file called “piwik-tracker.html” with the following contents:

<script src="/analytics/tracker.js"></script>

In your piwik root directory, create a second file called “tracker.js” with the following contents:

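// Piwik command queue; commands run in order once piwik.js has loaded.
var _paq = _paq || [];
(function() {
  // The logged-in user's name is read from the data-user attribute on <head>.
  var useridname = document.getElementsByTagName("head")[0].getAttribute("data-user");
  var u="//www.yoursiteurl.com/analytics/";
  _paq.push(['setTrackerUrl', u+'piwik.php']);
  _paq.push(['setSiteId', 1]);
  // Only set a user ID when one is present (the attribute may be missing when nobody is logged in).
  if (useridname) {
    _paq.push(['setUserId', useridname]);
  }
  _paq.push(['setDownloadClasses',['action-download']]);
  // Pass a real boolean: the string 'FALSE' is truthy in JavaScript.
  _paq.push(['discardHashTag', false]);
  // Configure first, then track, so the page view carries the user ID.
  _paq.push(['trackPageView']);
  _paq.push(['enableLinkTracking']);
  var d=document, g=d.createElement('script'), s=d.getElementsByTagName('script')[0];
  g.type='text/javascript'; g.async=true; g.defer=true; g.src=u+'piwik.js';
  s.parentNode.insertBefore(g,s);
})();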

Note, you need to update “var u” to reflect your site info.

Note, “var useridname”: that is the piece of code that was lacking from every other Owncloud/Piwik integration discussion. Other users used inline PHP to pull the username, but when the tracking code is moved into a separate .js file, rather than being inline in an existing PHP document, the server no longer parses the included PHP. After some digging and testing (as I know very little about coding JavaScript…) I came up with the native JavaScript I could use to pull the username from the HTML element that contains it. If you don’t get this working, your Piwik logs will not have any usernames and you miss out on a lot as a result.
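To make that lookup a bit more concrete, here is a minimal sketch of the same idea wrapped in a small helper. The name readUserId is my own invention (it is not part of Piwik or Owncloud), and it assumes, as described above, that the page’s head element carries a data-user attribute. It returns null when the attribute is missing or empty, so the caller can simply skip the setUserId call.

```javascript
// Hypothetical helper (not part of Piwik or Owncloud): read the logged-in
// user's name from the data-user attribute on <head>.
function readUserId(doc) {
  var head = doc.getElementsByTagName("head")[0];
  if (!head) {
    return null; // no <head> element at all
  }
  // getAttribute returns null when the attribute is absent
  var name = head.getAttribute("data-user");
  return name ? name : null;
}

// In the browser you would use it like this:
//   var user = readUserId(document);
//   if (user) { _paq.push(['setUserId', user]); }
```

Taking the document as a parameter instead of using the global one directly is just a convenience: it lets you try the function outside a browser with a stand-in object.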

Okay, so you created those two files. Now, we need to actually reference them in the appropriate places in our owncloud installation for tracking to actually work.

You need to add the following line:

<?php include '/var/www/owncloud/analytics/piwik-tracker.html'; ?>

in-between the

<body></body>

tags in the following files:

/var/www/owncloud/core/templates/layout.user.php
/var/www/owncloud/core/templates/layout.guest.php
/var/www/owncloud/apps/files_sharing/templates/public.php

Once you do that, refresh your owncloud page a few times and then check Piwik. Data should be flowing in, and it should show the name of the logged-in user generating it. Sweet!

— Apache Access Log —

One of the frustrating items I couldn’t figure out was getting Piwik to track file shares and downloads. Owncloud uses Ajax (forgive me if I make a hash of all this as I am not a coder) quite heavily. It uses Ajax to generate contextual download links for files on the fly. I.e. you hover your mouse over “download file” and the associated link is not

https://mycoolowncloudsite.com/file/private-doc.docx

If it were, I could figure out how to track that stuff in a heartbeat. Rather when you hover over a download link in owncloud, they ALL look like this:

https://mycoolowncloudsite.com/index.php/apps/files?dir=%2FDocuments#

You get the directory name, but not the file name. If someone has an answer for how to modify the piwik tracking code to catch the file-name, I am all ears! However I spent many long hours on it and never cracked it. So…

I turned to the Apache Access Log… First, in your virtual host file you probably need to clean up the logging a bit. Because we are hosting Piwik in a sub-directory, your Apache access log file is going to contain logs for both Piwik and Owncloud. If you want to turn off logging of everything that happens in Piwik (i.e. everything a user logged into Piwik does), your virtual host file should be configured for logging with something like this:

 SetEnvIf Request_URI ^/analytics(/|$) analytics
 ErrorLog ${APACHE_LOG_DIR}/owncloud-error.log
 CustomLog ${APACHE_LOG_DIR}/owncloud-access.log combined env=!analytics

This basically tells Apache a couple of things. First, I want my own error and access log that pertains to just this virtual host. Second, I don’t want to monitor any web-traffic in the “analytics” subfolder in my access log.
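If you want to sanity-check which request paths that SetEnvIf pattern catches, the same regular expression can be tried out in JavaScript. This is only an illustration: Apache evaluates the pattern itself, the function name isAnalyticsRequest is made up for this example, and I am relying on a pattern this simple behaving the same way in both regex engines.

```javascript
// Same pattern as in the SetEnvIf line:
// "/analytics" followed by either "/" or the end of the path.
var ANALYTICS_PATTERN = /^\/analytics(\/|$)/;

// Hypothetical helper: true when a request path would be tagged with the
// "analytics" variable and therefore left out of owncloud-access.log.
function isAnalyticsRequest(requestUri) {
  return ANALYTICS_PATTERN.test(requestUri);
}

console.log(isAnalyticsRequest("/analytics/piwik.php"));  // a Piwik request
console.log(isAnalyticsRequest("/index.php/apps/files")); // an Owncloud request
```

The (/|$) part is what keeps a path that merely starts with the same letters, say a folder called “analyticsfoo”, from being dropped from the log by accident.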

With that now done, your Access Log should be significantly less busy. If you are familiar with logging and log monitoring at all, you realize that getting rid of the “noise” in logs is like 90% of the game.

Okay, you can now “do stuff” on your owncloud site, and then tail your access log and you can see that Apache catches pretty much everything.

This is where I got lazy myself. Rather than write some nice scripts that run and create reports regularly (I started down this path and will share what I came up with, but in all fairness it isn’t where I ended up), I just shipped our access log off to a SIEM server that parses and analyzes it well enough for me. If you want to take this approach (which I kind of like because it also means there is an “off-server” backup copy of all of my logs), there are some open-source SIEM and log-management options you could look into, like Graylog2, AlienVault, and Syslog-NG, to name a few… It is WAY out of the scope of this article for me to get into that…

If you want to keep it simple, and are okay with your logs being kept on server and just want an easy way to dig through them from time to time, I suggest “grep” and “awk”. Here are some one-liners I wrote to quickly get me some info. (writing them wasn’t quick as I had never used AWK before… however it is kind of amazing…)

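# File downloads from the browser.
# Fields (split on whitespace): $1 client IP, $3 remote user, $4-$5 timestamp, $6 method, $7 request path, $10 bytes sent, $12 start of the user agent
grep -F 'ajax/download.php' /var/log/apache2/owncloud-access.log | awk '{print $1,$2,$3,$4,$5,$6,$7,$10,$12}'

# Most downloaded files:
grep -F 'ajax/download.php' /var/log/apache2/owncloud-access.log | awk '{print $7}' | sort | uniq -c | sort -rn | head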

Those are based on people interacting with the web application. I had just started working on mobile-client stuff. Ultimately I wanted to take it all and throw it into a script that could be run with some date/time parameters and spit out a report (either a text document or a CSV). Like I said though, I got lazy and just sent it to the SIEM :).
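For anyone who wants to pick up where I left off, here is a rough sketch of what the start of such a report script could look like in JavaScript (runnable with Node). It is not something I ran in production. The function names are made up, it only understands the Apache “combined” log format, and it ASSUMES the download requests carry the file name in a “files” query parameter. Check a few of your own log lines before trusting that.

```javascript
// Matches the start of an Apache "combined" log line:
// ip ident user [timestamp] "METHOD path protocol" status bytes
var LINE_PATTERN = /^(\S+) \S+ (\S+) \[([^\]]+)\] "(\S+) (\S+) [^"]*" (\d{3}) (\S+)/;

// Turn one log line into an object, or null if the line does not match.
function parseLogLine(line) {
  var m = LINE_PATTERN.exec(line);
  if (!m) {
    return null;
  }
  return { ip: m[1], user: m[2], time: m[3], method: m[4], path: m[5], status: Number(m[6]) };
}

// ASSUMPTION: the file name travels in a "files" query parameter.
// Returns the decoded file name, or null for anything that is not a download.
function downloadedFile(path) {
  if (path.indexOf("ajax/download.php") === -1) {
    return null;
  }
  var query = path.split("?")[1] || "";
  var parts = query.split("&");
  for (var i = 0; i < parts.length; i++) {
    var kv = parts[i].split("=");
    if (kv[0] === "files" && kv.length > 1) {
      var raw = kv.slice(1).join("=").replace(/\+/g, " ");
      try {
        return decodeURIComponent(raw);
      } catch (e) {
        return raw; // malformed escape sequence: keep the raw value
      }
    }
  }
  return null;
}

// Count downloads per file, most downloaded first.
function countDownloads(lines) {
  var counts = {};
  lines.forEach(function (line) {
    var entry = parseLogLine(line);
    var file = entry ? downloadedFile(entry.path) : null;
    if (file) {
      counts[file] = (counts[file] || 0) + 1;
    }
  });
  return Object.keys(counts)
    .map(function (name) { return { file: name, count: counts[name] }; })
    .sort(function (a, b) { return b.count - a.count; });
}
```

To use it, read the access log into an array of lines (for example with Node’s fs.readFileSync and a split on newlines) and pass that array to countDownloads. The date/time filtering I wanted would be a matter of checking the time field of each parsed entry before counting it.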


4 comments on: Owncloud + Piwik + Apache Access Log – Monitored Self-Hosted Enterprise File Sharing

  1. Konstantinos Dimkas

    Hi, congrats on the explanation. Probably the best online!
    But since I am new to administration, how do I check out Piwik?
    Am I supposed to create a site or something?

    Thanks in advance

    • nbeam

      Are you familiar with manually installing a CMS like WordPress or Drupal onto an Apache/MySQL server?

      If not, that would be the first place to start. Piwik is installed very similarly to how you would install WordPress or Drupal.

      Basically you create a folder, set up an Apache virtual host entry/file for it, enable it, and extract the web files into the folder.
      Then, before you go any further, create a database in MySQL with a new user and note the DB name, username, and password.

      Then visit the page in a browser that your server should now be hosting and go through the install steps. Part of the install will ask you for your database credentials and DB name for the DB you just set up.

      Then, all done.

      The majority of modern PHP-driven web applications install something like that. Piwik is no different. In this article, the only difference is I advocate for installing it in a sub-folder off of your Owncloud root folder instead of giving it its own URL/virtual host/directory etc.

      Hope this makes some sense. Good luck! Linux admin’ing has a bit of a steep learning curve initially but the effort is well worth it and rewarded as there are so many cool and fun things you can do once you get a handle on it.

      -Nathan

  2. Konstantinos Dimkas

    Thanks for the reply!
    If I have an existing Piwik installation, is there a way that I can add Owncloud to it?
    (maybe by using the files mentioned above)

  3. Lars_M

    hello,

    Thanks for the howto. With Nextcloud, the file at /var/www/owncloud/apps/files_sharing/templates/public.php has no <body> tag. At which position must I insert the code?

    Look here: https://github.com/nextcloud/server/blob/master/apps/files_sharing/templates/public.php
