2018-11-14


Team Meeting

14 NOVEMBER 2018 / 10:30 AM / #ow2-tc on IRC, freenode

Agenda

  1. New Projects acceptance process
  2. OW2 download statistics solution: state-of-the-art, part 3
  3. Any other business

Participants

  • Daniele Gagliardi DGA
  • Martin Hamant MHA
  • Assad Montasser AMO
  • Benoit Mortier BMO
  • Davide Zerbetto DZE

NEW PROJECTS ACCEPTANCE PROCESS

  • BMO: I have prepared a possible draft for the vote issue https://quotidien.framapad.org/p/vaDI7u5vr4

  • DGA: it looks good. I would be more specific about the part "if objections are not answered satisfyingly...": I think we should define how long the vote is suspended, and clarify how the TC accepts or rejects the answers.
    So we can say: the vote is suspended for up to a week, starting from now. Then we should also define a channel where the questions related to the veto and the answers will be posted: the submitter may not be on the TC mailing list.
    So: how can we send the submitter questions, and how should the answers be sent back?
    What do you think?

  • BMO: maybe we should have a vote email address or, better, a mailing list for this kind of case

  • DZE: what about using JIRA for this process?

  • BMO: a dedicated mailing list is better and more transparent

  • DGA: Jira is going to be decommissioned, in favor of the GitLab issue tracker

  • DZE: ok, we can use GitLab issues instead of JIRA. Of course I'm talking about a public issue, so in my opinion it will be transparent and visible enough; but if you prefer a mailing list, that is also fine with me

  • MHA: Jira has been offline for several months

  • DGA: well, it would be transparent and visible for sure, but I don't know if we really need to make the world aware of a veto on a project submission. Maybe a simple email exchange would be enough: we can simply start an email exchange between the submitter and the CTO and/or the TC chairman (the potential CTO), with the TC in copy

  • BMO: we should not restrict that to cto and tc chairman

  • DGA: the whole tc should be included in this: CTO, TC chairman and the TC mailing list

  • BMO: the TC should be like the technical committee of Debian

  • DGA: I didn't understand the question related to the quality. Could you explain?

  • BMO: for me, declining an applicant is also a good measure of how careful you are toward the OW2 project as a whole, so rejecting someone that doesn't fit is a quality :P

  • MHA: "Criteria for rejection are" ... the submission should be complete, that's all, all fields should be filled

  • BMO: yes, but in the case of a veto we should have some more criteria; filling in boxes doesn't tell me it's a good project for OW2

  • MHA: well, the boxes should be filled in properly, of course

  • BMO: for example, that's the case for the VCS repository, the communication channel or the size of the community

  • MHA: in the submission process all boxes should be filled in properly, and the TC is in charge of looking at those.
    I like the sentence you added. When the pad's done we should update the TC wiki

  • BMO: yes, it's better that way, to give some guidelines

  • DGA: well, I have some concerns about quality: we should define precisely the minimum quality criteria that should be met. Code quality: which SonarQube profile? I'm thinking about quality metrics (i.e. compliance with SonarQube quality gates)

  • BMO: the criteria listed in the pad are quality-based, no? Does it fit with the overall OW2 goal? Is it useful to the other projects inside OW2?
    Does it give an edge to OW2 in terms of communication? Those are the first that pop into my mind

  • BMO: quality of code is an entirely different story and I will not go that way because it's very difficult :)

  • MHA: the mission of OW2 is, among other things, to help projects enhance their overall quality, so to me SonarQube is something we add afterwards

  • DGA: what about projects which don't have a repo yet, but already have the code in some internal repo?

  • BMO: enhancing code quality could be one of the benefits of being an OW2 member: we could teach each other or hold some kind of seminar to help projects. An internal repo is a veto for me, because we know they have to make the effort first

  • MHA: to me the source code should be public, as we asked of SPECTRE

  • DGA: ok, but in that case we opened a space within the OW2 infra before the project had been accepted

  • MHA: yes, that's a good question. That's a pretty isolated case to me. SPECTRE is the first applicant project that falls into this case

  • DGA: I think it's quite fair for OW2 to support projects by offering them the opportunity to have a repo within the OW2 infra

  • BMO: yes but if we accept that we break the rules of submission

  • MHA: we can open a repo and close it if the project is rejected, I mean it's a detail

  • BMO: I would propose we have a space in GitLab with temporary repositories in case the code is internal; those repositories will be erased if the project is not accepted. But then again, if the code is internal you don't have a website, so it falls apart everywhere

  • MHA: those are very isolated cases, but a dedicated space is a great idea. But again, it happens once every 5 years :)

  • MHA: who's in charge of updating the TC wiki from the pad? Is everything clear?

  • DGA: for me yes

  • BMO: me too :)

  • DGA: should we ask for a vote about these changes or not? Can we proceed directly to updating the submission process?

  • MHA: wait, what about a missing website then?

  • BMO: dangagliar: I think there is a needed fix

  • MHA: we need to define the actions to take when a criterion is not met, especially those that are not straightforward

  • BMO: "The submission should be complete with specific attention to:" it is said here

  • MHA: yes, but what do we answer to the submitter?

  • BMO: a website is basic for me :)

  • MHA: "no visible community", and what to do if nothing can be provided: we should be crystal clear

  • BMO: then we reject; if there is no community it's like dead code for me

  • BMO: OW2 is a forge of projects that have a future: you need at least an internal community and x other people using your code

  • MHA: I fully agree, that's not my point. I'm asking more about the process in terms of communication, wording etc., like explaining why it's rejected and what to do next

  • BMO: we can make answer templates, that's very easy, so it's always communicated the same way
    No Website => Action template

  • MHA: The project is rejected for the following reasons {}
    What you can do about it {}
    What we have evaluated {}

  • BMO: yes: "if you want to resubmit, please provide us with the xxx"

  • MHA: so we need to complete the pad :) Can we do this offline? I mean after the meeting; that's not only words, we all need to take part in it :)

  • BMO: dangagliar added the two possibilities to the pad. It's a one-day pad, but I will transfer this to a monthly pad :) https://mensuel.framapad.org/p/zyjduppjug here is the monthly pad

  • DGA: action point: https://gitlab.ow2.org/ow2/technology-council/issues/7
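
The answer template discussed above could be sketched as follows. This is only an illustration in Python: the three sections come from the discussion, while the class name RejectionNotice, its field names and the example values are hypothetical and not an agreed OW2 format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RejectionNotice:
    """Hypothetical answer template: one instance per rejected submission."""
    project: str
    reasons: List[str]       # "the project is rejected for the following reasons"
    next_steps: List[str]    # "what you can do about it"
    evaluated: List[str]     # "what we have evaluated"

    def render(self) -> str:
        sections = [
            (f"The project {self.project} is rejected for the following reasons:",
             self.reasons),
            ("What you can do about it:", self.next_steps),
            ("What we have evaluated:", self.evaluated),
        ]
        lines = []
        for title, items in sections:
            lines.append(title)
            lines.extend(f"  - {item}" for item in items)
            lines.append("")  # blank line between sections
        return "\n".join(lines).rstrip() + "\n"


# Example values are made up for illustration only.
notice = RejectionNotice(
    project="ExampleProject",
    reasons=["No website"],
    next_steps=["If you want to resubmit, please provide us with a public website"],
    evaluated=["Website", "VCS repository", "Communication channel",
               "Size of the community"],
)
print(notice.render())
```

Keeping the wording in one place means every submitter gets the same structure, whichever criterion was not met.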

     

DOWNLOAD STATS

  • MHA: I wrote https://tc.ow2.org/view/wiki/Download-Statistics-Part3. I started configuring matomo.ow2.org, injecting download stats. I have a few issues to look into: I have imported all 2018 logs from legacy downloads, then I tried to request Live.getLastVisitsDetails for Jan 2018 and the size of the JSON result is very large :) So we might need to figure out if we need to request days instead of months. The case I'm speaking of is "approach 2". For approach 1, we don't have this problem
  • DGA: well, would it be possible to monitor Matomo while it's responding to a request, to evaluate the impact and decide which is the best approach? If we see that approach 2 has a low impact, approach 2 wins. The same holds for approach 1
  • MHA: well that's easy, it depends on the number of visits we have for a given month. For Jan 2018, memory_limit set to 512M is not enough: that's my initial evaluation :=) But we can base approach 2 on days rather than months, same for approach 1 btw, as in theory we have already stored the previous days in the CSV files. My concern is then: what do we do if we happen to get holes, for X reasons, like Matomo not being online? How do we fill the gaps? On which criteria? What is a "hole"?
  • DGA: well, data are always available in the logs: would it be possible to define special jobs that run just in case of a hole? Of course Matomo should run to process the logs, then the jobs should ask Matomo
  • MHA: Then these cases should be considered real cases now and not ignored: they should be prepared for in the design, so that it is easy to fill gaps when it happens
  • DGA: anyway, how does Matomo deal with its offline time slots? Does it process the logs starting from its last processing?
  • MHA: the log analyzer is another thing: Matomo's log analyzer is not capable of seeing whether a log entry has already been integrated into its database. See my notice "we should avoid running the analyzer twice on the same log file at all cost"
  • MHA: when I said "offline" before, I meant that the API is offline, that is, Matomo's output, but it doesn't mean the database is inconsistent
  • DGA: I was considering that very aspect about double processing, ok, so let's focus on API offline
  • MHA: well, do we report 0 or NaN in the CSV?
  • DGA: if the API is offline, yes it's a NaN, but then in the post-processing phase maybe we should try to get the missing data.
  • MHA: I guess post-processing would run once a month. The logs run from 6 to 6 the next day. That's not directly related to NaNs, but more to the log analysis phase. The question I'm working on is how to detect if a log file wasn't processed, or was only partially processed, for some reason: my initial direction is to do an API request before starting the log analyzer
  • DGA: let me get this straight: requests to Matomo get download stats until 6 am of the current day, starting from 6 am of the previous day: so if the request fails due to the API being offline, we'll have a NaN for that time slot
  • MHA: logs = raw webserver log files, 6 to 6 each, but when you request the API, you get a day from 0 to 0 of course
  • DGA: but when does the log analyzer run? Once a day?
  • MHA: yes, after 6 I guess. Today it doesn't run in a cron. I have to implement this, and make sure it doesn't miss anything
  • DGA: ok, so suppose the log analyzer runs at 6 am today, and it analyzes download stats from 6 am yesterday to 6 am today. Then the download stat job would run every day at, say, 6 pm, and it will get all info from 0 to 0 of the previous day.
  • MHA: But if the log analyzer fails for some reason, we are in trouble: we need to know when it failed and on what, making sure we're not running the analyzer on log entries that are already in Matomo's database
  • DGA: ok, I think we should look for some best practices on it
  • MHA: to me that's a feature that should be part of Matomo in the first place, so we are working around it here. If it happens, we would have to look at Matomo by hand and retrigger the analysis of the specific log files
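
The "request days instead of months" and "NaN, then fill the gaps in post-processing" ideas discussed above could be sketched as follows. This is a minimal Python illustration, not the agreed design: the fetch callable stands in for whatever performs the real Matomo API request for one day (it is deliberately left abstract here), and the function names and the two-column CSV layout are assumptions.

```python
import csv
import io
import math
from datetime import date, timedelta
from typing import Callable, Iterator, List, Tuple

Row = Tuple[date, float]


def days(start: date, end: date) -> Iterator[date]:
    """Yield every calendar day from start to end, inclusive."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=1)


def collect_daily(fetch: Callable[[date], int], start: date, end: date) -> List[Row]:
    """Request one day at a time; a failed request becomes NaN, never 0."""
    rows = []
    for day in days(start, end):
        try:
            rows.append((day, float(fetch(day))))
        except Exception:
            # API offline or request failed: record a hole, not a count.
            rows.append((day, math.nan))
    return rows


def find_holes(rows: List[Row]) -> List[date]:
    """A 'hole' here is simply a day whose value is NaN."""
    return [day for day, value in rows if math.isnan(value)]


def fill_holes(rows: List[Row], fetch: Callable[[date], int]) -> List[Row]:
    """Post-processing pass: retry only the days recorded as NaN."""
    filled = []
    for day, value in rows:
        if math.isnan(value):
            try:
                value = float(fetch(day))
            except Exception:
                pass  # still a hole; keep NaN for the next pass
        filled.append((day, value))
    return filled


def to_csv(rows: List[Row]) -> str:
    """Serialize rows as 'date,downloads', writing holes as the text NaN."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["date", "downloads"])
    for day, value in rows:
        writer.writerow([day.isoformat(), "NaN" if math.isnan(value) else int(value)])
    return buffer.getvalue()
```

Writing NaN rather than 0 keeps "no data" distinguishable from "no downloads", so a later job can find exactly which days to ask Matomo for again.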

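One possible workaround for "we should avoid running the analyzer twice on the same log file" is a small ledger kept outside Matomo. The sketch below is an assumption, not something decided in the meeting: the JSON ledger file, the "started"/"done" states and all names are hypothetical, and the analyze callable stands in for the actual log analyzer run.

```python
import hashlib
import json
from pathlib import Path
from typing import Callable, Optional


def file_digest(path: Path) -> str:
    """SHA-256 of the file contents, so a renamed copy is still recognized."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


class Ledger:
    """Hypothetical JSON file recording which log files were analyzed."""

    def __init__(self, path: Path):
        self.path = path
        self.entries = json.loads(path.read_text()) if path.exists() else {}

    def status(self, digest: str) -> Optional[str]:
        return self.entries.get(digest, {}).get("status")

    def mark(self, digest: str, name: str, status: str) -> None:
        self.entries[digest] = {"file": name, "status": status}
        self.path.write_text(json.dumps(self.entries, indent=2, sort_keys=True))


def import_once(log_file: Path, ledger: Ledger,
                analyze: Callable[[Path], None]) -> str:
    """Run analyze on log_file at most once; flag interrupted runs."""
    digest = file_digest(log_file)
    status = ledger.status(digest)
    if status == "done":
        return "skipped"
    if status == "started":
        # A previous run died mid-import: some entries may already be in
        # the database, so do not rerun automatically.
        return "needs-manual-check"
    ledger.mark(digest, log_file.name, "started")
    analyze(log_file)  # if this raises, the status stays "started"
    ledger.mark(digest, log_file.name, "done")
    return "imported"
```

A file left in the "started" state is exactly the case where someone would have to look at Matomo by hand before retriggering the analysis.
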
Next meeting

  • Tue, 11 Dec. 10:00-11:00 CET