Adam Cameron's Dev Blog

@dac_dev Twitter account

G'day:
Yes, this thing is still alive.

This morning I indicated on Twitter I was discontinuing my usage of the @dac_dev Twitter account. I'll just be using my @adam_cameron one from now on.

@NXRK made this observation just now:


Just to be clear: my twitter account is the @adam_cameron one. The @dac_dev one was created specifically for promoting this blog, and the "brand" that was my existence in the dev world, such as it is. Specifically the CFML dev world.

I don't want to promote that brand any more, or at least not such that I want to keep it separate from my "real" self. Or more that there's simply not enough usage of that account to warrant me having to keep it separate from my own account.

I won't be deleting the @dac_dev one, and I have set it to send me emails if anyone hits me up there, but I am not paying attention to it any more, nor do I intend to post anything to it.

As with anyone's account on Twitter, you are free to keep an eye on my mumblings on the @adam_cameron one. I'm less polite there (if that can be believed...), and it's definitely more "general" commentary. Rugby, cricket, specious social commentary, naive political observations and photos of my meals. Well: not the latter. I'm not a complete twat(*).

In a closely-related move, I've chucked-in my participation on the CFML Slack channel too. I was not getting anything positive out of my membership there, and the shitness of some elements of the CFML community was doing my head in (if yer reading this, I'm almost certainly not including you in that shit demographic).

Righto.

--
Adam

(*) based on just that metric, I mean.

PHP: A function that returns two different things

G'day:
I'm gonna plagiarise one of me own answers to a Stack Overflow question here, as I think it's good generic advice.

The question was "PHP 7: Multiple function return types". The subject line sums it up really. The person here wants to return a value or false from a function. I see this a lot in PHP code. Indeed... PHP itself does it a lot.

My answer was as follows:

[...]

From a design perspective, having a function that potentially returns different types of result indicates a potential design flaw:

  • If you're returning your result, or otherwise false if something didn't go to plan; you should probably be throwing an exception instead. If the function processing didn't go according to plan; that's an exceptional situation: leverage the fact. I know PHP itself has a habit of returning false if things didn't work, but that's just indicative of poor design - for the same reason - in PHP.
  • If your function returns potentially different things, then it's quite possible it's doing more than one thing, which is bad design in your function. Functions should do one thing. If your function has an if/else with the true/false block handling different chunks of processing (as opposed to just handling exit situations), then this probably indicates you ought to have two functions, not one. Leave it to the calling code to decide which is the one to use.
  • If your function returns two different object types which then can be used in a similar fashion in the calling code (ie: there's no if this: do that; else do this other thing in the calling code), this could indicate you ought to be returning an interface, not a concrete implementation.
There will possibly be legit situations where returning different types is actually the best thing to do. If so: all good. But that's where the benefit of using a loosely typed language comes in: just don't specify the return type of the function. But… that said… the situation probably isn't legit, so work through the preceding design considerations first to determine if you really do have one of these real edge cases where returning different types is warranted.
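To illustrate the first point above, here's a minimal sketch of the exception-based approach (the function, the User class and the UserNotFoundException are all made-up names, just for demonstration):

function getUserById(PDO $connection, int $id) : User {
    $statement = $connection->prepare("SELECT id, name FROM user WHERE id = :id");
    $statement->execute(["id" => $id]);
    $row = $statement->fetch(PDO::FETCH_ASSOC);

    if ($row === false) {
        // rather than returning false here, throw: it's an exceptional situation
        throw new UserNotFoundException("No user with ID $id");
    }

    return new User($row["id"], $row["name"]);
}

The calling code then deals with the exception if and when it cares about it, and the function only ever returns one type.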

I'll add two more considerations here.

There's another option which is kinda legit. In PHP 7.1 one can declare the return type to be nullable.

Consider a function which potentially returns a boolean:

function f($x) {
    if ($x) {
        return true;
    }
}

We can't declare that with a return type of bool as if our logic strays down that falsey branch, we get a fatal error:

function f($x) : bool {
    if ($x) {
        return true;
    }
}

f(1); // OK

f(0); // PHP Fatal error: Uncaught TypeError: Return value of f() must be of the type boolean, none returned


But in 7.1, one can declare the return type as nullable:

function g($x) : ?bool {
    if ($x) {
        echo "returning true";
        return true;
    }

    echo "returning null";
    return null;
}

g(1); // returning true
g(0); // returning null

Now sometimes this is legit. One might be looking for something from some other resource (eg: a DB record) which quite legitimately might not exist. And there might not be a legit "null implementation" of the given class one is wanting to return: for example returning the object with some/all of its properties not set or something. In this case a null is OK. But remember in this case one is forcing more logic back on the calling code: checking for the null. So not necessarily a good approach. I'd consider whether this situation is better handled with an exception.
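As a concrete sketch of that "legitimately might not exist" situation (the repository, DAO and Person class here are made-up names):

function findPersonByEmail(string $email) : ?Person {
    $row = $this->dao->selectPersonByEmail($email); // null if there's no matching record

    if ($row === null) {
        return null;
    }

    return new Person($row["id"], $row["name"]);
}

// the null-check logic is now the calling code's problem
$person = $repository->findPersonByEmail("example@example.com");
if ($person === null) {
    // decide here whether that's exceptional or not
}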

The last consideration is that one can still specify multiple return types in a type-hint annotation, eg:

/**
 * @param bool $b
 * @return A|B
 */
function f($b){
    if ($b) {
        return new A();
    } else {
        return new B();
    }
}


This doesn't actually enforce anything, so it's mostly a lie, just like all comments are, but at least one's IDE might be able to make sense of it, and give code assistance and autocomplete suggestions when writing one's code:



See how it offers both the method from an A and a B there.

Still: I really dislike these annotations as they're really just comments, and especially in this case they're lying: f can actually return anything it likes.

Right... must dash. I am supposed to be down @ my ex's place hanging out with my son in 10min. I'll press "send" and proof-read this later ;-)

Righto.

--
Adam

Yeah, you do want 100% test coverage

G'day:
I've read a number of blog articles recently that have been eschewing 100% test coverage. The rationale being that there's little difference between 90% and 100%, and that that last 10% can be quite tricky to attain as - we being human 'n' all - sometimes leave the worst to last. Obviously we should all be doing TDD so that situation would not arise, but - again - we're human so don't always do TDD for one reason or another ("can't be arsed" is the usual one).

Another rationale is that it's not a race to 100%. IE: the object of the exercise is delivering working software which adds business value; it's not "getting 100% test coverage".

The latter is true. But we still aim for 100% test coverage for basically one good reason: visibility.

First, look at this:



Not bad I guess. Not great.

Now look at this:



We've accidentally forgotten to cover something new. Can you see it? (It's one of the provider methods.)

On the other hand look at these two:



Somewhat easier to spot, yeah?

This is bloody handy. In this case I just commented-out one of my tests to get the report off 100%, but in the real world I'd see that and go "huh, WTF?", and perhaps discover I had a logic branch untested, or a test I thought was covering something wasn't actually doing so. Or I just forgot to put the @covers annotation on the test. Whatever the situation: it's handy for the shortfall to be so obvious.

What it also means is part of our pull-request process is running a Bamboo build which runs all our tests (plus some other code quality tools):

 Everything's green here, so it's OK for code review. If it was failing, I'd get a nudge from my reviewers going "erm... you sure this is ready to go?" And even if I thought it was, we can't merge anything whilst the build is failing anyhow. The issue needs to be addressed.

Now the key thing here is this:


We don't test everything, but we cover everything. Where "cover" might be "we don't need to test this". In this case a single variable assignment does not need testing. We trust PHP to follow simple instructions. The thing is: this still shows up as green in the test coverage report.
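Marking code as "covered but we're not testing it" is just a matter of using PHPUnit's coverage-ignore annotation on the method (or class) concerned. A minimal sketch (the class and method names here are made up):

class SomeSetting
{
    private $value;

    /**
     * A single variable assignment: we trust PHP to follow simple instructions.
     * @codeCoverageIgnore
     */
    public function setValue($value)
    {
        $this->value = $value;
    }
}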

We do this in a few situations:

  • the method has no branching logic and is otherwise pretty simple. Ignore it.
  • we're backfilling coverage on old code we need to maintain, but it's such a mess we simply don't have time to solve all the testing shortfalls right now, so we test what is urgent (the bit we're changing), but just mark the rest of it as ignored. Now that file is green, and we can see if it stops being green.
  • similar to the above but we know the code in question is not long for this world. So we ignore it and stick a TODO cross-referencing the ticket that will be dealing with it. And the ticket must already be lined-up to be worked on "soon".

There might be other situations where we need to be pragmatic and skip some stuff, but this needs to be explained to the code reviewers, and get their agreement.

But being able to run a report that says "100%", and shows up anything that isn't 100% is gold.

Get yer coverage report green, but don't be afraid to be pragmatic and skip stuff, if there's an informed reason to do so.

Now this isn't to say one can just pepper ignore annotations around the place just to get the report green. That's daft. The object of the exercise is not to make some report green / say 100%. It's to get good code quality. This is just a tool to help identify the quality: it's still up to the devs to provide the quality. Don't forget that. It needs to be an informed, honest decision for one to ignore code.

Righto.

--
Adam

Mishmash: code review as a learning exercise, loops vs higher-order-functions and testing

G'day:
There's a few things going on in my head here, and none individually are worthy of an article, but perhaps if I dwell on a bunch of stuff something might shake out. I have no idea as I have not drafted this one in my head like I usually do, I'm just "let's start typing and see if I get anywhere". Eek.

OK so yesterday I was code reviewing one of my mates' work, and breezed past this code (this is not the exact code, but it's a replication of it):

public function getSomeThings($rowsToGet)
{
    $thingsValues = $this->dao->getThingsFromStorage($rowsToGet);

    $things = [];
    foreach ($thingsValues as $thingValues) {
        $things[] = $this->thingFactory->createFromRawData($thingValues);
    }

    return $things;
}

I just put a note in the pull request "array_map?".

Now I do a few things in a code review. Firstly I check for obvious flaws, unclean code, bad test coverage, typos, etc, and I will raise those as "tasks" in BitBucket: stuff that needs to be addressed before I'll press the "Approved" button. This is our standard approach to code review, everyone expects it, and it's fine. It works.

But secondly I'll just make observations about the code, ask questions, suggest alternative approaches we could have taken, etc. Just as food for thought, a discussion point, or a chance for some learning. This was the latter. I was not suggesting to my mate they needed to change their code to match my whim... the code is fine as it is. It was just "did you think of this?".

An interesting conversation ensued, which I will paraphrase here. Note that Fred and I do not work in the same physical office.

Fred (his name's not Fred): I'm not sure the benefit of swapping simple code for a more complex solution, unless there's some performance gain I'm unaware of.

Adam (my name is Adam): Hmmm... well array_map is one expression, and it reduces the complexity here to a single variable assignment statement(*). The loop version requires building the array by hand, and requires two assignments and the whole loop thing too. So I think the array_map approach is less complex innit?

return array_map(function ($thingValues) {
    return $this->thingFactory->createFromRawData($thingValues);
}, $thingsValues);

(*) OK it's two statements. Never let it be said I don't hyperbolise to better make my point.

One thing I did not get around to saying is that performance is not a consideration here. One of the calls is hitting an external DB, so any processing time differences are going to be eclipsed - by orders of magnitude - by the external call. And, yes, I know that if we'd be inclined to worry about micro-optimisations, then calling a higher-order function is slower than a loop. One should simply not care.

Adam (cont'd): furthermore the array_map version only requires a single test, whereas the loop version needs three. That in itself demonstrates the increased complexity.

Fred: err... why's that then?

I'll break away from the conversation here.

The dev process here is that we identify the need to have a function to get stuff from the DB, and then model that and return it to the calling code. So doing TDD we write a test which uses a mocked-out DAO to return known data, and we test that the values in the resultant models are what we expected them to be based on the mocked DAO data.

Here's a very simplified, self-contained example:

First our test:

describe('comparing testing for foreach vs array_map', function() {
    given('sut', function () {
        return new ForeachVsArrayMap();
    });
    given('allObjects', function () {
        return [
            (object) ['id' => 1, 'mi' => 'tahi'],
            (object) ['id' => 2, 'mi' => 'rua'],
            (object) ['id' => 3, 'mi' => 'toru'],
        ];
    });
    describe('tests of foreach', function() {
        it('returns the numbers as objects', function () {
            $result = $this->sut->getObjects();
            expect($result)->toEqual($this->allObjects);
        });
    });
});

And now our implementation:

class ForeachVsArrayMap
{
    private $allData;

    public function __construct()
    {
        $this->allData = [
            ['id' => 1, 'mi' => 'tahi'],
            ['id' => 2, 'mi' => 'rua'],
            ['id' => 3, 'mi' => 'toru']
        ];
    }

    private function getNumbers($rows)
    {
        return array_slice($this->allData, 0, $rows);
    }

    public function getObjects($rows = 3)
    {
        $data = $this->getNumbers($rows);
        $numbers = [];
        foreach ($data as $row) {
            $numbers[] = (object) $row;
        }

        return $numbers;
    }
}

And that passes. That's our TDD side of things done.

However we also now have to do edge testing as well. We've got a loop in there, and all loops need tests for three things: zero iterations, one iteration, and more than one iteration. It's easy to introduce logic errors around the plurality of loop iterations.

Here our TDD test has covered the "more than one iteration" case, so we only need the other two:

it('handles one result', function () {
    $result = $this->sut->getObjectsUsingForeach(1);
    $expected = [$this->allObjects[0]];

    expect($result)->toEqual($expected);
});

it('handles zero results', function () {
    $result = $this->sut->getObjectsUsingForeach(0);
    $expected = [];

    expect($result)->toEqual($expected);
});

Pretty simple tests, but still: two additional tests.

Why don't we need these two tests when using array_map? Because array_map's iteration is all handled within its implementation. It's PHP's job to test that, not ours. So all we need to test is that the inline callback works. And we only need to test that once: when it's called, it does what we expect it to do. We can safely stipulate that array_map will always return an array with the same number of elements as the input value. PHP deals with that; we don't need to.

One good thing about TDD (and test coverage in general) is now we can safely swap in the array_map version, and the test coverage still passes, so we know - with a high degree of certainty - that our code is A-OK.

public function getObjects($rows = 3)
{
    return array_map(function ($row) {
        return (object) $row;
    }, $this->getNumbers($rows));
}

Coincidentally I had another conversation, in another code review, on Friday about the necessity of always testing loops for 0, 1, >1 iterations. "One can just look at the code and know it's fine". This statement didn't come up in the conversation, but I've heard it before.

Three weeks ago I was show-stopped by a bug in one of our libraries. Taking the analogy above, this was the code:

public function getObjectsUsingForeach($rows = 3)
{
    $data = $this->getNumbers($rows);

    foreach ($data as $row) {
        $numbers[] = (object) $row;
    }

    // do something else with $numbers here
    // ...
}

The code had no tests. Can you see what's wrong? The general expectation when calling a function to get data is that there's actually data to get, given the provided criteria. In the case I fell foul of, the value of $rows was zero. And this was legit. So if $rows is 0, what value does $numbers have after the loop? Well: $numbers was undefined.

Had the dev done their testing properly, then they'd've noticed their bug. They never initialised $numbers outside of the code in the loop, so there's a chance it might never get defined. In this case not only did the dev not do the edge-case testing, they didn't even do the baseline TDD testing either, but that's a different story.
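For the record, the fix is trivial: initialise the array before the loop so it exists even when the loop runs zero times:

public function getObjectsUsingForeach($rows = 3)
{
    $data = $this->getNumbers($rows);

    $numbers = []; // defined even if $data is empty
    foreach ($data as $row) {
        $numbers[] = (object) $row;
    }

    // do something else with $numbers here
    // ...
}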

I needed to down-tools and fix the library before I could continue my own work. And it was made very tricky by the fact the code in the library had no tests at all, and wasn't written in a clean-code fashion (so was basically untestable). But we are not allowed to work that way any more, so I needed to do a bunch of work the original developer ought to have done.

I have also fallen foul of this brainfart too:

public function getObjectsUsingForeach($rows = 3)
{
    $data = $this->getNumbers($rows);

    foreach ($data as $row) {
        $numbers = [];
        $numbers[] = (object) $row;
    }

    // do something else with $numbers here
    // ...
}

This usually crops up when code that was previously used once suddenly needs to be used for multiple iterations. Note that the initialisation of $numbers is inside the loop, so it will re-initialise every iteration. This would be caught by the "more than one iteration" test. This is rarer, and usually only crops up in completely untested code: it'd be a shit tester who tested a loop with only one iteration as the test, but I've seen it done. Saves time not having to mock so much data, you see? Berks.

So, anyhow... always test yer loops. And try to avoid loops if there are built-in functions that handle the iteration for you.

Also always do TDD, but also understand that there is more testing to do after the TDD cycle. TDD is not the end of the story there.



In the title of this article I mention "code review as a learning exercise". I greatly summarised the back and forth between Fred and myself here... there was an order of magnitude more discussion than there was code under discussion. This sort of interaction is gold. Some people get the idea that code review is kinda like a teacher marking homework. It's partly that, but that's only part.

As programmers and coders our job is one of communication: communication of instructions to the computer itself, but also to our future colleagues and selves about the nature of those instructions. Programming languages are a means to communicate instructions to other humans; it's tangential that the computer can also be made to understand them. Similarly, reviewing each other's code should be a communication and collaboration process, and we absolutely should be having these discussions.

Both Fred and I benefited from this conversation: a lot of what I typed here wasn't formalised in my brain before explaining it, but now I have a much clearer picture of what I do and why. And Fred knows too. This was not a waste of time, as we're both now better developers for it. And it might have seemed like a lot of back and forth in the code review, but it really only represents about 10min of discussion. And all this came from just a conversation point, not me saying "change the code".

If we were in the same office, we might have had the conversation via voice instead of typing in the code review, and some people might think that's better. In a way it is, but it also means that the details of the conversation don't get recorded, so benefit fewer people. That said, when programmers are not agreeing on things, often it is better to switch to voice instead of continue the discussion via text-based back-and-forth. It's just easier and more collaborative sometimes.

Right, that's that. A bit random, but didn't turn out so bad I think. And I seem to have found a point to make.

Righto.

--
Adam

PHP: misuse of nullable arguments

G'day:
PHP 7.1 added nullable types to the PHP language. Here's the RFC. In short, one can specify that a return type or an argument type will accept null instead of an object (or primitive) of the specified type. This code demonstrates:

function f(int $x) : int {
    return $x;
}

try {
    $x = null;
    $y = f($x);
} catch (\TypeError $e) {
    echo $e->getMessage();
}

echo PHP_EOL . PHP_EOL;

function g(?int $x) : ?int {
    return $x;
}

$x = null;
$y = g($x);
var_dump($y);

Output:

Argument 1 passed to f() must be of the type integer, null given, called in [...][...] on line 8

NULL

Why do we want this? Well if we look at a statically-typed language, it makes a bit of sense:

public class NullableTest {

    public static void main(String[] args){
        Integer x = null;
        Integer y = f(x);
        System.out.println(y);
    }

    private static Integer f(Integer x){
        return x;
    }
}

So x is an Integer... it's just not been given a value yet, but it'll pass the type-check on the method signature. (This just outputs null, btw).

I think the merits of this functionality in PHP are limited when it comes to an argument type. If we had that equivalent code in PHP, it's not that x is an int that just happens to not have been given a value; it's a null. Not the same. It's a trivial but key difference I think.

Semantics aside there are occasional times where it's handy and - more importantly - appropriate. Consider the setMethods method of PHPUnit's MockBuilder class. The way this works is that one can pass one of three values to the method:

  • null - no methods will be mocked-out;
  • an empty array - all methods will be mocked-out;
  • an array of method names - just those methods will be mocked-out.

[docs][implementation]

Here the null has meaning, and it's distinct in meaning from an empty array. Even then this only makes "sense" because for convenience the meaning of an empty array is inverted when compared to a partial array: a partial array means "only these ones", so an empty array following that logic would mean "only these ones, which is none of them", but it has been given a special meaning (and IMO "special meaning" is something one ought to be hesitant about assigning to a value). Still: it's handy.
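For illustration, that looks like this in use (a minimal sketch; ExampleService is a made-up class, and the comments just restate the three bullet points above):

// null: no methods will be mocked-out
$mock = $this->getMockBuilder(ExampleService::class)->setMethods(null)->getMock();

// empty array: all methods will be mocked-out
$mock = $this->getMockBuilder(ExampleService::class)->setMethods([])->getMock();

// array of method names: just those methods will be mocked-out
$mock = $this->getMockBuilder(ExampleService::class)->setMethods(["doThing"])->getMock();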

Had I been implementing PHPUnit, I probably would not have supported null there. I'd've perhaps had two methods: mockMethods(array subsetToMock) and mockAllMethods(), and if one doesn't specify either of those, then no methods are mocked. No values with special meaning, and clearly-named methods. Shrug. PHPUnit rocks, so this is not an indictment of it.


On the other hand, most of the usage I have seen of nullable arguments is this sort of thing:

function f(?int $x = 0){
    if (is_null($x)){
        $x = 0;
    }
    // rest of function, which requires $x to be an int
}

This is a misuse of the nullable functionality. This function actually requires the argument to be an integer. Null is not a meaningful value to it. So the function simply should not accept a null. It's completely legit for a function to specify what it needs, and require the calling code to actually respect that contract.

I think in the cases I have seen this being done that the dev concerned is also working on the calling code, and they "know" the calling code could quite likely have a null for the value they need to pass to this function, but instead of dealing with that in the calling code, they push the responsibility into the method. This ain't the way to do these things. A method shouldn't have to worry about the vagaries of the situation it's being used in. It should specify what it needs from the calling code to do its job, and that's it.
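A minimal sketch of keeping that responsibility where it belongs - in the calling code (the names here are hypothetical):

function f(int $x) {
    // rest of function, which requires $x to be an int: it simply doesn't accept null
}

// the calling code deals with the vagaries of its own situation
$value = $repository->getPossiblyMissingValue();
f($value ?? 0);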

Another case I've seen is that the dev concerned didn't quite get the difference between "nullable" and having an argument default. They're two different things, and they're conflating the two. When an argument is nullable, one still actually needs to pass a value to it, null or otherwise:

function f(?int $x) : int {
    return $x;
}

try {
    $y = f();
} catch (\TypeError $e) {
    echo $e->getMessage();
}

Output:

Too few arguments to function f(), 0 passed in [...][...] on line 7 and exactly 1 expected

All "nullable" means is it can be null.

I really think the occasions where a nullable argument is legit is pretty rare, so if yer writing code or reviewing someone else's code and come across one being used... stop and think about what's going on, and whether the function is taking on a responsibility the calling code oughta be looking after.

Righto.

--
Adam

PHP: array_map has a dumb implementation

G'day:
This cropped up a few weeks back, but I've had no motivation to write anything recently so I've sat on it for a bit. It came back to bite me again the other day and my motivation seems to be returning so here we go.

I was working on some code the other day wherein I needed to query some stuff from one DB, and then for the records returned from that, look for records in an entirely different data repository for some "additional data". The best I could do was to get the IDs from the first result and do like a WHERE IN (:ids) of the other data source to get records that were the additional data for some/all of the IDs. There was no one-to-one match between the two data sources, so the second set of results might not have records for all IDs from the first one. It was convenient for me to return the second result not as an indexed array, but as an associative array keyed on the ID. This is easy enough in PHP, and is quite handy:

$peopleData = $conn
    ->query('SELECT date, name FROM mapexampledata')
    ->fetchAll(PDO::FETCH_ASSOC | PDO::FETCH_COLUMN | PDO::FETCH_UNIQUE);

var_dump($peopleData);

Result:

[
    "2008-11-08" => "Jacinda",
    "1990-10-27" => "Bill",
    "2014-09-20" => "James",
    "1979-05-24" => "Winston"
]

Note how the array key is the date column from the query, and the value of the array is the other column (or all the subsequent columns, if there were more). I've just indexed by date here to make it more clear that the index key is not just an integer sequence.

From this data, I needed to remap that to be an array of PersonalMilestone objects, rather than an array of raw data. Note that the object has both date and name values:

class PersonalMilestone {
    public $date;
    public $name;

    function __construct($date, $name) {
        $this->date = $date;
        $this->name = $name;
    }
}

(If it's not obvious... this domain model and data is completely unrelated to the work I was actually doing, it's just something simple I concocted for this blog article. The general gist of the issue is preserved though).

"Easy", I thought; "that's just an array_map operation". Oh hohohoho, how funny I am. I forgot I was using PHP.

So I wrote my tests (OK, I did not write the tests for this example for the blog. "IRL" I did though. Always do TDD, remember?), and then went to write the array_map implementation:

$peopleData = ["2008-11-08" => "Jacinda", "1990-10-27" => "Bill", "2014-09-20" => "James", "1979-05-24" => "Winston"];

$people = array_map(function ($name) {
    return new PersonalMilestone(
        $date, // dammit this won't work: array_map only gives me the value, not the key
        $name
    );
}, $peopleData);

F*** it. This just beggars belief: the other languages I have done the same operation in pass both key and value into the map callback. What does PHP do instead? Well it allows you to pass multiple arrays to the array_map function, and the callback receives the value from each of those arrays. I dunno why they chose to implement that as even additional behaviour, let alone the default (and only) behaviour. Sigh.

But I thought I could leverage this! I could pass the keys of my array in as a second array, and then I'll have both the values I need to create the PersonalMilestone object. Hurrah!

$peopleData = ["2008-11-08" => "Jacinda", "1990-10-27" => "Bill", "2014-09-20" => "James", "1979-05-24" => "Winston"];

$keys = array_keys($peopleData);

$people = array_map(function ($name, $date) {
    return new PersonalMilestone($date, $name);
}, $peopleData, $keys);

var_dump($people);

Result:

array(4) {
[0]=>
object(PersonalMilestone)#2 (2) {
["date"]=>
string(10) "2008-11-08"
["name"]=>
string(7) "Jacinda"
}
[1]=>
object(PersonalMilestone)#3 (2) {
["date"]=>
string(10) "1990-10-27"
["name"]=>
string(4) "Bill"
}
[2]=>
object(PersonalMilestone)#4 (2) {
["date"]=>
string(10) "2014-09-20"
["name"]=>
string(5) "James"
}
[3]=>
object(PersonalMilestone)#5 (2) {
["date"]=>
string(10) "1979-05-24"
["name"]=>
string(7) "Winston"
}
}

Looks OK right? Look again. The array has been changed into an indexed array. So this isn't a remapping, this is just "making a different array". F***'s sake, PHP.

Initially I thought this was because the two arrays I was remapping had different keys, and thought "aah... yeah... if I was daft I could see how I might implement the thing like this", so I changed the second array to be keyed by its own value (whilst shaking my head in disbelief).

$peopleData = ["2008-11-08" => "Jacinda", "1990-10-27" => "Bill", "2014-09-20" => "James", "1979-05-24" => "Winston"];
var_dump($peopleData);

$keys = array_keys($peopleData);
$keysIndexedByKeys = array_combine($keys, $keys);
var_dump($keysIndexedByKeys);

$people = array_map(function ($name, $date) {
    return new PersonalMilestone($date, $name);
}, $peopleData, $keysIndexedByKeys);

var_dump($people);

(and, yes, this is a lot of horsing around now, even had it worked which (spoiler) it didn't).

Result:
array(4) {
["2008-11-08"]=>
string(7) "Jacinda"
["1990-10-27"]=>
string(4) "Bill"
["2014-09-20"]=>
string(5) "James"
["1979-05-24"]=>
string(7) "Winston"
}
array(4) {
["2008-11-08"]=>
string(10) "2008-11-08"
["1990-10-27"]=>
string(10) "1990-10-27"
["2014-09-20"]=>
string(10) "2014-09-20"
["1979-05-24"]=>
string(10) "1979-05-24"
}
array(4) {
[0]=>
object(PersonalMilestone)#2 (2) {
["date"]=>
string(10) "2008-11-08"
["name"]=>
string(7) "Jacinda"
}
[1]=>
object(PersonalMilestone)#3 (2) {
["date"]=>
string(10) "1990-10-27"
["name"]=>
string(4) "Bill"
}
[2]=>
object(PersonalMilestone)#4 (2) {
["date"]=>
string(10) "2014-09-20"
["name"]=>
string(5) "James"
}
[3]=>
object(PersonalMilestone)#5 (2) {
["date"]=>
string(10) "1979-05-24"
["name"]=>
string(7) "Winston"
}
}

So the first array and the second array now have the same keys, but... the remapped array still doesn't. This is where we get to the subject line of this article: array_map has a dumb implementation.

I gave up on array_map. I really don't think it's fit for purpose beyond the most obviously simple applications. I just used a foreach loop instead:

$people = [];
foreach ($peopleData as $date => $name) {
    $people[$date] = new PersonalMilestone($date, $name);
}

var_dump($people);

And this works:

array(4) {
["2008-11-08"]=>
object(PersonalMilestone)#1 (2) {
["date"]=>
string(10) "2008-11-08"
["name"]=>
string(7) "Jacinda"
}
["1990-10-27"]=>
object(PersonalMilestone)#2 (2) {
["date"]=>
string(10) "1990-10-27"
["name"]=>
string(4) "Bill"
}
["2014-09-20"]=>
object(PersonalMilestone)#3 (2) {
["date"]=>
string(10) "2014-09-20"
["name"]=>
string(5) "James"
}
["1979-05-24"]=>
object(PersonalMilestone)#4 (2) {
["date"]=>
string(10) "1979-05-24"
["name"]=>
string(7) "Winston"
}
}

This is simple enough code, but I like having my code more descriptive if possible, so if I'm iterating over an array because I want to remap its values, then I prefer to use a mapping call, not a generic loop. But we need to be pragmatic about these things.

Oh... for the sake of completeness I did work out how to do it "properly" (where "properly" == "passes my tests" not "is good code"):

$keys = new ArrayIterator(array_keys($peopleData));

$people = array_map(function ($name) use ($keys) {
    $date = $keys->current();
    $keys->next();
    return new PersonalMilestone($date, $name);
}, $peopleData);

(I'll spare you the output... you get the idea by now).

What I've done here is to pass a separate iterator into the callback, and I manually iterate over that each iteration of the callback loop. This gives me the value for the date key for each element of the source array. But that's just shoehorning square peg into a round hole, so I would note that down as a curio, but not actually advocate its use.
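For completeness, another way to skin it - still horsing around, but at least stateless - is to let array_map mangle the keys and then bolt them back on afterwards with array_combine:

$people = array_combine(
    array_keys($peopleData),
    array_map(function ($name, $date) {
        return new PersonalMilestone($date, $name);
    }, $peopleData, array_keys($peopleData))
);

This passes the same tests, but it's still working around array_map rather than working with it.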

I was left with the feeling "PHP, you're f***in' rubbish" (a feeling I get increasingly the more I use it), but I thought just cos I know other languages which do this in a more clear fashion doesn't mean that they're right and PHP is (unhelpfully) wrong. I decided to have a look at how a few other languages tackle the same issue, to see if PHP really does suck in this regard. However that will be the topic for a follow-up article, as this one is already long enough.

Righto.

--
Adam

That array_map quandary implemented in other languages

G'day:
A coupla days ago I bleated about array_map [having] a dumb implementation. I had what I thought was an obvious application for array_map in PHP, but it couldn't really accommodate me due to array_map not exposing the array's keys to the callback, and then messing up the keys in the mapped array if one passes array_map more than one array to process.

I needed to remap this:

[
    "2008-11-08" => "Jacinda",
    "1990-10-27" => "Bill",
    "2014-09-20" => "James",
    "1979-05-24" => "Winston"
]

To this:

array(4) {
'2008-11-08' =>
class IndexedPerson#3 (2) {
public $date =>
string(10) "2008-11-08"
public $name =>
string(7) "Jacinda"
}
'1990-10-27' =>
class IndexedPerson#4 (2) {
public $date =>
string(10) "1990-10-27"
public $name =>
string(4) "Bill"
}
'2014-09-20' =>
class IndexedPerson#5 (2) {
public $date =>
string(10) "2014-09-20"
public $name =>
string(5) "James"
}
'1979-05-24' =>
class IndexedPerson#6 (2) {
public $date =>
string(10) "1979-05-24"
public $name =>
string(7) "Winston"
}
}

Note how the remapped object also contains the original key value. That was the sticking point. Go read the article for more detail and more whining.

OK so my expectations of PHP's array higher order functions are based on my experience with JS's and CFML's equivalents. Both of which receive the key as well as the value in all callbacks. I decided to see how other languages achieve the same end, and I'll pop the code in here for shits 'n' giggles.


CFML

Given most of my history is as a CFML dev, that one was easy.

peopleData = {"2008-11-08" = "Jacinda", "1990-10-27" = "Bill", "2014-09-20" = "James", "1979-05-24" = "Winston"}

people = peopleData.map((date, name) => new IndexedPerson(date, name))

people.each((date, person) => echo("#date# => #person#<br>"))

Oh, this presupposes the IndexedPerson component. Due to a shortcoming of how CFML works, components must be declared in a file of their own:

component {

    function init(date, name) {
        this.date = date
        this.name = name
    }

    string function _toString() {
        return "{date:#this.date#; name: #this.name#}"
    }
}


But the key bit is the mapping operation:

people = peopleData.map((date, name) => new IndexedPerson(date, name))

Couldn't be simpler (NB: this is Lucee's CFML implementation, not ColdFusion's which does not yet support arrow functions).

The output is:

1979-05-24 => {date:1979-05-24; name: Winston}
2014-09-20 => {date:2014-09-20; name: James}
1990-10-27 => {date:1990-10-27; name: Bill}
2008-11-08 => {date:2008-11-08; name: Jacinda}

Also note that CFML doesn't have associative arrays, it has structs, so the keys are not ordered. This does not matter here.


JS

The next language I turned to was JS as that's the one I'm next most familiar with. One thing that hadn't occurred to me is that whilst JS's Array implementation has a map method, we need to use an object here as the keys are values, not indexes. And whilst I knew Objects didn't have a map method, I didn't know what the equivalent might be.

Well it turns out that there's no real option to use a map here, so I needed to do a reduce on the object's entries. Still: it's pretty terse and obvious:

class IndexedPerson {
    constructor(date, name) {
        this.date = date
        this.name = name
    }
}

let peopleData = {"2008-11-08": "Jacinda", "1990-10-27": "Bill", "2014-09-20": "James", "1979-05-24": "Winston"}

let people = Object.entries(peopleData).reduce(function (people, personData) {
    people.set(personData[0], new IndexedPerson(personData[0], personData[1]))
    return people
}, new Map())

console.log(people)

This returns what we want:

Map {
'2008-11-08' => IndexedPerson { date: '2008-11-08', name: 'Jacinda' },
'1990-10-27' => IndexedPerson { date: '1990-10-27', name: 'Bill' },
'2014-09-20' => IndexedPerson { date: '2014-09-20', name: 'James' },
'1979-05-24' => IndexedPerson { date: '1979-05-24', name: 'Winston' } }

TBH I think this is a misuse of an object to contain basically an associative array / struct, but so be it. It's the closest analogy to the PHP requirement. I was able to at least return it as a Map, which I think is better. I tried to have the incoming personData as a map, but the Map prototype's equivalent of entries() used above is unhelpful in that it returns an Iterator, and the prototype for Iterator is a bit spartan.

I think it's slightly clumsy I need to access the entries value via array notation instead of some sort of name, but this is minor.

As with all my code, I welcome people showing me how I should actually be doing this. Post a comment. I'm looking at you Ryan Guill ;-)

Java

Next up was Java. Holy fuck what a morass of boilerplate nonsense I needed to perform this simple operation in Java. Deep breath...

import java.util.HashMap;
import java.util.Map;
import java.util.stream.Collectors;

class IndexedPerson {
    String date;
    String name;

    public IndexedPerson(String date, String name) {
        this.date = date;
        this.name = name;
    }

    public String toString(){
        return String.format("{date: %s, name: %s}", this.date, this.name);
    }
}

class Collect {

    public static void main(String[] args) {

        HashMap<String,String> peopleData = loadData();

        HashMap<String, IndexedPerson> people = mapToPeople(peopleData);

        dumpIdents(people);
    }

    private static HashMap<String,String> loadData(){
        HashMap<String,String> peopleData = new HashMap<String,String>();

        peopleData.put("2008-11-08", "Jacinda");
        peopleData.put("1990-10-27", "Bill");
        peopleData.put("2014-09-20", "James");
        peopleData.put("1979-05-24", "Winston");

        return peopleData;
    }

    private static HashMap<String,IndexedPerson> mapToPeople(HashMap<String,String> peopleData) {
        HashMap<String, IndexedPerson> people = (HashMap<String, IndexedPerson>) peopleData.entrySet().stream()
            .collect(Collectors.toMap(
                e -> e.getKey(),
                e -> new IndexedPerson(e.getKey(), e.getValue())
            ));

        return people;
    }

    private static void dumpIdents(HashMap<String,IndexedPerson> people) {
        for (Map.Entry<String, IndexedPerson> entry : people.entrySet()) {
            System.out.println(String.format("%s => %s", entry.getKey(), entry.getValue()));
        }
    }

}

Result:
1979-05-24 => {date: 1979-05-24, name: Winston}
2014-09-20 => {date: 2014-09-20, name: James}
1990-10-27 => {date: 1990-10-27, name: Bill}
2008-11-08 => {date: 2008-11-08, name: Jacinda}

Most of that lot seems to be just messing around telling Java what types everything are. Bleah.

The interesting bit - my grasp of which is tenuous - is the Collectors.toMap. I have to admit I derived that from reading various Stack Overflow articles. But I got it working, and I know the general approach now, so that's good.

Too much code for such a simple thing though, eh?


Groovy

Groovy is my antidote to Java. Groovy makes this shit easy:

class IndexedPerson {
    String date
    String name

    IndexedPerson(String date, String name) {
        this.date = date;
        this.name = name;
    }

    String toString(){
        String.format("date: %s, name: %s", this.date, this.name)
    }
}

peopleData = ["2008-11-08": "Jacinda", "1990-10-27": "Bill", "2014-09-20": "James", "1979-05-24": "Winston"]

people = peopleData.collectEntries {date, name -> [date, new IndexedPerson(date, name)]}

people.each {date, person -> println String.format("%s => {%s}", date, person)}

Bear in mind that most of that is getting the class defined, and the output. The bit that does the mapping is just the one line in the middle. That's more like it.

Again, I don't know much about Groovy… I had to RTFM to find out how to do the collectEntries bit, but it was easy to find and easy to understand.

I really wish I had a job doing Groovy.

Oh yeah, for the sake of completeness, the output was thus:

2008-11-08 => {date: 2008-11-08, name: Jacinda}
1990-10-27 => {date: 1990-10-27, name: Bill}
2014-09-20 => {date: 2014-09-20, name: James}
1979-05-24 => {date: 1979-05-24, name: Winston}


Ruby

Ruby's version was pretty simple too as it turns out. No surprise there as Ruby's all about higher order functions and applying blocks to collections and stuff like that.

class IndexedPerson

  def initialize(date, name)
    @date = date
    @name = name
  end

  def inspect
    "{date:#{@date}; name: #{@name}}\n"
  end
end

peopleData = {"2008-11-08" => "Jacinda", "1990-10-27" => "Bill", "2014-09-20" => "James", "1979-05-24" => "Winston"}

people = peopleData.merge(peopleData) do |date, name|
  IndexedPerson.new(date, name)
end

puts people

Predictable output:

{"2008-11-08"=>{date:2008-11-08; name: Jacinda}
, "1990-10-27"=>{date:1990-10-27; name: Bill}
, "2014-09-20"=>{date:2014-09-20; name: James}
, "1979-05-24"=>{date:1979-05-24; name: Winston}
}

I wasn't too sure about all that block nonsense when I first started looking at Ruby, but I quite like it now. It's easy to read.


Python

My Python skills don't extend much beyond printing G'day World on the screen, but it was surprisingly easy to google-up how to do this. And I finally got to see what Python folk are on about with this "comprehensions" stuff, which I think is quite cool.

class IndexedPerson:
    def __init__(self, date, name):
        self.date = date
        self.name = name

    def __repr__(self):
        return "{{date: {date}, name: {name}}}".format(date=self.date, name=self.name)

people_data = {"2008-11-08": "Jacinda", "1990-10-27": "Bill", "2014-09-20": "James", "1979-05-24": "Winston"}

people = {date: IndexedPerson(date, name) for (date, name) in people_data.items()}

print("\n".join(['%s => %s' % (date, person) for (date, person) in people.items()]))


And now that I am all about Clean Code, I kinda get the "whitespace as indentation" thing too. It's clear enough if yer code is clean in the first place.

The output of this is identical to the Groovy one.

Only one more then I'll stop.

Clojure

I can only barely do G'day World in Clojure, so this took me a while to work out. I also find the Clojure docs to be pretty impenetrable. I'm sure they're great if one already knows what one is doing, but I found them pretty inaccessible from the perspective of a n00b. It's like if the PHP docs were solely the user-added stuff at the bottom of each docs page. Most blog articles I saw about Clojure were pretty much just direct regurgitation of the docs, without much value-add, if I'm to be honest.

(defrecord IndexedPerson [date name])

(def people-data (array-map "2008-11-08" "Jacinda" "1990-10-27" "Bill" "2014-09-20" "James" "1979-05-24" "Winston"))

(def people
  (reduce-kv
    (fn [people date name] (conj people (array-map date (IndexedPerson. date name))))
    (array-map)
    people-data))

(print people)

The other thing with Clojure for me is that the code is so alien-looking to me that I can't work out how to indent stuff to make the code clearer. All the examples I've seen don't seem very clear, and the indentation doesn't help either, I think. I guess with more practise it would come to me.

It seems pretty powerful though, cos there's not much code there to achieve the desired end-goal.

Output for this one:

{2008-11-08 #user.IndexedPerson{:date 2008-11-08, :name Jacinda},
1990-10-27 #user.IndexedPerson{:date 1990-10-27, :name Bill},
2014-09-20 #user.IndexedPerson{:date 2014-09-20, :name James},
1979-05-24 #user.IndexedPerson{:date 1979-05-24, :name Winston}}


Summary

This was actually a very interesting exercise for me, and I learned stuff about all the languages concerned. Even PHP and CFML.

I twitterised a comment regarding how pleasing I found each solution:


This was before I did the Clojure one, and I'd slot that in after CFML and before JS, making the list:
  1. Python
  2. Ruby
  3. Groovy
  4. CFML
  5. Clojure
  6. JS
  7. PHP
  8. Java

Python's code looks nice and it was easy to find out what to do. Same with Ruby, just not quite so much. And, really same with Groovy. I could order those three any way. I think Python tips the scales slightly with the comprehensions.

CFML came out surprisingly well in this, as it's a bloody easy exercise to achieve with it.

Clojure's fine, just a pain in the arse to understand what's going on, and the code looks a mess to me. But it does a lot in little space.

JS was disappointing because it wasn't nearly so easy as I expected it to be.

PHP is a mess.

And - fuck me - Java. Jesus.

My occasional reader Barry O'Sullivan volunteered some input the other day:


Hopefully he's still up for this, and I'll add it to the list so we can have a look at that code too.

Like I said before, if you know a better or more interesting way to do this in any of the languages above, or any other languages, make a comment and post a link to a Gist (just don't put the code inline in the comment please; it will not render at all well).

I might have another one of these exercises to do soon with another puzzle a friend of mine had to recently endure in a job-interview-related coding test. We'll see.

Righto.

--
Adam

PHP: perpetual misuse of array type declarations

G'day:
This has been irking me enough recently to cause me to dust off this old blog.

For a while PHP has had limited argument type checking on function arguments, and this was improved in PHP 7. The docs cover it - Function arguments › Type declarations - so I'll not repeat them here. But we can have this code:

function takesArray(array $a){
    return $a;
}

var_dump(takesArray(["tahi", "rua", "toru", "wha"]));

Output:

array(4) {
[0]=>
string(4) "tahi"
[1]=>
string(3) "rua"
[2]=>
string(4) "toru"
[3]=>
string(3) "wha"
}

And if we try to pass something other than an array:

<?php
function takesArray(array $a){
    return $a;
}

try {
    $a = takesArray("NOT_AN_ARRAY");
} catch (\Throwable $t) {
    echo $t;
}

We get this:

TypeError:
Argument 1 passed to takesArray() must be of the type array, string given,
called in [...][...] on line 7 and defined in [...][...]:2
Stack trace:
#0 [...][...](7): takesArray('NOT_AN_ARRAY')
#1 {main}

Fair enough. No complaint there.

What I have an issue with is that now I see a lot of devs slapping "array" as a parameter type in every method signature that takes an array. What's wrong with that, you probably ask? Well... in most situations they're doing this, it's not simply an array that is the requirement. It's generally one of these two situations:

function getFirstName(array $name){
    return $name['firstName'];
}

$firstName = getFirstName(["lastName" => "Cameron", "firstName" => "Adam"]);

echo $firstName;

OK, this is a really contrived example, but it's simple and demonstrates the point. Here the function takes an array, and returns the firstName from the array. But note that the requirement of the argument is not "it's an array", as stated. It very specifically needs to be "an associative array that has a key firstName". "array" is not correct here. It's not at all helpful. If one is needing to do this sort of thing, then create a class Name, and specify that. Because that's what you actually want here:

class Name {
    public $lastName;
    public $firstName;

    function __construct($lastName, $firstName) {
        $this->lastName = $lastName;
        $this->firstName = $firstName;
    }
}

function getFirstName(Name $name){
    return $name->firstName;
}

The second bogus usage of array as a type declaration is with indexed arrays. Consider this:

function filterOutEarlierDates(array $dates, int $year){
    $cutOff = new \DateTime("$year-01-01");

    $filtered = array_filter($dates, function (\DateTimeInterface $date) use ($cutOff) {
        $diff = $date->diff($cutOff);
        return $diff->invert === 1;
    });

    return array_values($filtered);
}

$dobs = [
    new \DateTime("1968-12-20"),
    new \DateTime("1970-02-17"),
    new \DateTime("2011-03-24"),
    new \DateTime("2016-08-17")
];
$cutOff = 2011;

$youngins = filterOutEarlierDates($dobs, $cutOff);

var_dump($youngins);


Here filterOutEarlierDates claims it takes an array. Well it does. Again this lacks the necessary precision for the requirement. It doesn't take an array. It takes - specifically - an array of DateTimeInterface objects. If you give it anything other than that, the code breaks. As type checking is specifically to guard against that, specifying array is simply neither correct nor helpful here.
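One option worth noting - a sketch only, not a suggestion the original code should be written this way - is a variadic signature, which does get PHP to enforce the element type, at the cost of reordering the parameters and changing the call site:

function filterOutEarlierDates(int $year, \DateTimeInterface ...$dates) {
    // every element of $dates is now guaranteed to be a DateTimeInterface
    // ...
}

filterOutEarlierDates($cutOff, ...$dobs); // spread the array at the call site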

Specifying array as an argument type check does have a place. When what the function needs actually is an array. And just an array. Any old array.

function array_some(array $array, callable $callback) {
    foreach ($array as $key => $value) {
        $result = $callback($value, $key);
        if ($result) {
            return true;
        }
    }
    return false;
}

$colours = ["whero", "karaka", "kowhai", "kakariki", "kikorangi", "poropango", "papura"];

$containsSixLetterWords = array_some($colours, function ($colour) {
    return strlen($colour) === 6;
});

var_dump($containsSixLetterWords);

$numbers = [
    "one" => "tahi",
    "two" => "rua",
    "three" => "toru",
    "four" => "wha",
    "five" => "rima",
    "six" => "ono",
    "seven" => "whitu",
    "eight" => "ware",
    "nine" => "iwa",
    "ten" => "tekau"
];

$hasShortKeys = array_some($numbers, function ($number, $key) {
    return strlen($key) <= 3;
});

var_dump($hasShortKeys);

Here's a PHP equivalent of JS's some method. It takes an array (any array) and a callback, and applies the callback to each array entry until the callback returns true, then it exits. This function correctly claims the first argument value needs to be an array. Without any other sort of qualification: it just needs to be an array.

Actually there's a similar issue with my type checking of the $callback parameter there. It can't be just any callable: it needs to be a callable that takes a coupla arguments and returns a boolean. In C# I guess we'd use a Delegate for that, but there's no such construct in PHP. So I should probably settle for giving the argument a clear name ($callback probably isn't good enough here), and dispense with the type check.

But even then I question the merits of type-checking array here. What if instead of just an array, we had a collection object that implements \Iterator, eg:

class ColourCollection implements \Iterator {
    private $position = 0;
    private $colours;

    public function __construct($colours) {
        $this->position = 0;
        $this->colours = $colours;
    }

    public function rewind() {
        $this->position = 0;
    }

    public function current() {
        return $this->colours[$this->position];
    }

    public function key() {
        return $this->position;
    }

    public function next() {
        ++$this->position;
    }

    public function valid() {
        return isset($this->colours[$this->position]);
    }
}

$rainbow = new ColourCollection($colours);

If we run this variation of the code, we get:

Fatal error:
Uncaught TypeError: Argument 1 passed to array_some() must be of the type array, object given,
called in [...][...] on line 48 and defined in [...][...]:2
Stack trace:
#0 [...][...](48): array_some(Object(ColourCollection), Object(Closure))
#1 {main}
thrown in [...][...] on line 2

If we take the type check out: the code works. So how is the array type check really helping us here? It's not.

To be honest I think I'm being a bit edge-case-y with that last example. I don't see a lot of objects that implement the Iterator interface: we just stick with arrays instead. But it is a legit consideration I guess.
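For what it's worth, PHP 7.1 added the iterable pseudo-type, which accepts either an array or any Traversable, so a declaration like this would cover both cases (a minimal sketch of the same function):

function array_some(iterable $values, callable $callback) {
    foreach ($values as $key => $value) {
        if ($callback($value, $key)) {
            return true;
        }
    }
    return false;
}

// an array or a ColourCollection both pass the type check now
$containsSixLetterWords = array_some(new ColourCollection($colours), function ($colour) {
    return strlen($colour) === 6;
});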

I would use array as a type check in that array_some situation - where the argument actually just needs to be an array, with no special other qualities about it. But this is very rare. In other cases where the argument value needs to be "some specific type or array", then I think it's dead wrong to have an array type check there.

Righto.

--
Adam

In defence of public properties

G'day:
Recently I was re-writing one of our cron job scripts. It's the sort of thing that runs at 2am, reads some stuff from the DB, analyses it, and emails some crap to interested parties. We usually try to avoid complete rewrites, but this code was written... erm... [cough insert some period of time ago when it was apparently OK to write really shit code with no code quality checks and no tests]... and it's unmaintainable, so we're biting the bullet and rewriting coherently whilst adding some business logic changes into it.

The details for the email - to, from, subject, data to build the body from - come from three different sources in total, before sending to the mailer service to be queued-up for sending. So I knocked together a simple class to aggregate that information together:

class SpecialInterestingReportEmail {

    public $from;
    public $to;
    public $subject;
    public $body;

    public static $bodyTemplate = "specialInterestingReportEmail.html.twig";

    function __construct($from, $to, $subject, $body) {
        $this->from = $from;
        $this->to = $to;
        $this->subject = $subject;
        $this->body = $body;
    }
}

I then have a factory method in an EmailMessageService (not entirely sure that's the best name for it, but [shrug]) which chucks all the bits and pieces together, and returns an email object:

function createSpecialInterestingReportEmail($special, $interesting) {
    $body = $this->twigService->render(
        SpecialInterestingReportEmail::$bodyTemplate,
        [
            "special" => $special,
            "interesting" => $interesting
        ]
    );
    return new SpecialInterestingReportEmail(
        $this->addressesForReport->from,
        $this->addressesForReport->to,
        $this->reportSubject,
        $body
    );
}

I don't think this design is perfect, but in the context of what we're doing it's OK.

Notice:
All interaction I mention below with my colleagues is paraphrased, embellished, or outright changed to make my point. I am not quoting anyone, and take this as a work of fiction. It's also drawn from previous similar conversations I've had on this topic.

One of my code reviewers was horrified:

Code Reviewer: "You can't have public properties! That's bad OOP!"
Me: "Why?"
CR: "You're breaking encapsulation. You need to do it like this [proceeds to tutor me on how to write accessor methods, because, like, I needed a tutorial on design anti-patterns]".
Me: "Why? What are we gaining by making those properties private, just to force us to write a bunch of boilerplate code to then access them? And how is that a good change to make to this code that now exists and is fulfilling the requirement? This is code review... we've already done the planning for this. This class is just a single container to aggregate some data into a conceptual 'email message' object. It doesn't need any behaviour - it doesn't need any methods - it's just for moving data between services".
CR: "But you can't break encapsulation!"
Me: "Why not? Other than parroting dogma read from some OOP 101 book, why is this a problem? And how's it solved by accessor methods?"
CR: "Well someone could directly change the values to be wrong!"
Me: "Well... they probably shouldn't do that then. That'd be dumb. They could equally put the wrong values straight into the constructor too. It'd still be just as wrong. Look... we don't need to validate the data - that seems to be your concern here? - as it's just a script reading known-good values from the DB, and sending them to our email queue. The code's right there. There's no scope for the data to accidentally be wrongified. And if it was... the tests would pick it up anyhow".
CR: "But what if other code starts using this code?"
Me: "What... yer saying 'what it some other application starts using this cronjob as some sort of library?' Why would that happen? This is not a public API. It makes no pretence of being a public API. If anyone started using this as an API, they deserve everything they get".
CR: "But they might".
ME: "OK, let's say some of this code is gonna be needed somewhere else, and we need to make it into a library. At that point in time, we'd extract the relevant code, consider stuff like encapsulation, data hiding, providing a public interface etc. But that would be different code from this lot".
CR: "But you should write all code like it will be used that way".
Me: "Again: why? This code is not going to be used this way. It just isn't. And anyhow, what yer suggesting is YAGNI / coding-for-an-unknown-future anyhow. We don't gain anything for the purposes of the current requirement chucking pointless boilerplate code into these classes. That's not an improvement".

And it went on.

One problem I encounter fairly frequently - both at work and in the wild - is people who will read about some concept, and thenceforth That Concept Is Law. Someone has Said It, therefore it applies to every single situation thereafter. They don't ever seem to bother trying to reason why the "law" was stated in the first place, what about it makes it a good rule, and when perhaps it's not necessary.

I look at these things and think "will this benefit my current task?". Generally it does because these things don't acquire acceptance without scrutiny and sanity-checking. But sometimes, however, it doesn't make sense to follow the dogma.

In this situation: it does not help.

In a different situation, if I was writing a separate API which handled the Email object creation, and other apps were gonna use it, I'd've possibly tightened-up the property access. But only possibly. My position on such things is to be liberal with what one permits to be done with code. If all my theoretical accessor method was gonna achieve was to return a private value... I'd really consider just leaving it public instead, and allow direct access. Why not?

There's a risk that later I might need to better control access to those properties, but... I'd deal with that at the time: these things can be handled as/when. It's even easy enough to have transitionary code from direct-access to accessor-access using __get and __set. I know these are frowned-upon, but in transitionary situations: they're fine. So one could seamlessly patch the API for code already consuming it via direct access with that transitionary approach, and then in the next (or some subsequent ~) version, make the breaking change to require using accessors, eg:

v1.0 - direct access only.
v1.1 - add in transitionary code using __get and __set. Advise that direct access is deprecated and will be removed in the next release. Also add in accessor methods.
v2.0 - remove direct access.

It doesn't even need to be v2.0. Just "some later version". But for the sake of making sure the transitionary code is temporary, better to do sooner rather than later. The thing is that software versioning is there for a reason, so it's OK to only introduce necessary coding overhead when it's needed.
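
To illustrate what I mean by that transitionary shim, here's a rough sketch - using a made-up Email class rather than anything from our codebase - of how __get can keep old direct-access code working whilst pointing people at the accessor (__set would be the same idea for writes):

class Email {

    private $from;

    public function __construct($from) {
        $this->from = $from;
    }

    // the accessor we want people to migrate to
    public function getFrom() {
        return $this->from;
    }

    // transitionary shim: old code doing $email->from still works, but now
    // raises a deprecation notice so it can be found and fixed before v2.0
    public function __get($name) {
        trigger_error(
            "Direct access to $name is deprecated; use the accessor method instead",
            E_USER_DEPRECATED
        );
        return $this->$name;
    }
}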

Another thing that occurred to me when I was thinking about this. Consider this code:

$email = [
"from" => "from@example.com",
"to" => "to@example.com",
"subject" => "Example subject",
"body" => "Example body"
];

$fromAddress = $email["from"];

Perfectly fine. So how come this code is not fine:

$email = new Email(
"from@example.com",
"to@example.com",
"Example subject",
"Example body"
);

$fromAddress = $email->from;

Why's it OK to access an array directly, but it's not - apparently - OK to give the collection of email bits a name (Email), and otherwise use it the same way?

I can't think of one.

Rules are great. Rules apply sensibly 95% of the time. But when one reads about a rule... don't just stop there... understand the rule. The rules are not there simply for the sake of enabling one to not then think about things. Always think about things.

Righto.

--
Adam

PS: completely happy to be schooled as to how I am completely wrong here. This is half the reason I wrote this.

PHP: Expectation management with Symfony Validator validation

$
0
0
G'day:
I was checking out one of our web services the other day, replicating a bug I was fixing, and found the need to guess at what the body content needed to be for a POST I was doing. Our validation is solid, so I could just start with an empty object, and the validation would tell me what I needed to pass. Or, yes, I could RTFM but I couldn't be arsed.

Anyhow, this tactic worked for the first few properties: the validation came back with "you need this property"... "nah, it needs to be an integer"... "that's too low", etc. Cool. But then I passed an empty string for one of the properties, and instead of getting a 400 back with a hint as to what the constraints needed to be; I instead got a 500 server error. F***'s sake.

{
"errors": [
"Internal Server Error"
]
}

This is good and bad. Bad that it was a 500 for some reason; good that there's no detail bleeding out. This was part of what I was fixing.

Looking at the logs I was getting this:

{"errors": ["Expected argument of type \"array or Traversable and ArrayAccess\", \"string\" given"]}

Indeed I was passing in a string:

{
"firstName" : "Zachary",
"lastName" : "Cameron Lynch",
"address" : ""
}

(not the actual data in question, but it's work code so I cannot share it). Right so address is supposed to be an object, eg:

{
"firstName" : "Zachary",
"lastName" : "Cameron Lynch",
"address" : {
"street" : "123 Some Street",
"city" : "Our City"
}
}

That sort of thing.

But anyhow, why's the validation failing? Well not failing - I'd expect / want that - but why is it erroring? I assumed we were doing something wrong: not validating it was an object, and just charging off and using it. I looked into it, and our code seemed legit, so I started trying to make a solid repro case so I could ask people out in the world about it.

I know nothing about Symfony's Validator component. I have code-reviewed other people using it, so am vaguely aware of how it works, but I've not really used it. First things first then... the docs.

I concocted this test case from the introductory docs:

use PHPUnit\Framework\TestCase;
use Symfony\Component\Validator\Constraints\Length;
use Symfony\Component\Validator\Constraints\NotBlank;
use Symfony\Component\Validator\Validation;

class StringValidatorTest extends TestCase
{
    public function testExampleFromDocumentation()
    {
        $validator = Validation::createValidator();
        $violations = $validator->validate(
            'Bernhard',
            [
                new Length(['min' => 10]),
                new NotBlank(),
            ]
        );

        $this->assertCount(1, $violations);
        $this->assertEquals(
            'This value is too short. It should have 10 characters or more.',
            $violations[0]->getMessage()
        );
    }
}

And this worked as expected:

PHPUnit 7.2.6 by Sebastian Bergmann and contributors.

. 1 / 1 (100%)

Time: 218 ms, Memory: 4.00MB

OK (1 test, 2 assertions)

Also this is pretty straightforward: create a validator, pass a value and some constraints to its validate method, and... job done.

From there I needed to test the behaviour when passing a value of a different data type from the one expected. I decided expecting an integer and then not giving it one would be the easiest test here:

First a class to wrap-up the validator with the constraints in question:

namespace me\adamcameron\general\validation\collectionIssue;

use Symfony\Component\Validator\Constraints\Type;
use Symfony\Component\Validator\Validation;

class IntegerValidator
{

    private $validator;
    private $constraints;

    public function __construct()
    {
        $this->validator = Validation::createValidator();
        $this->setConstraints();
    }

    private function setConstraints()
    {
        $this->constraints = [
            new Type(['type' => 'integer'])
        ];
    }

    public function validate($value)
    {
        return $this->validator->validate($value, $this->constraints);
    }
}

Nothing surprising there. I have a constraints collection which is simply "gotta be an integer". And a method that takes a value and validates it against those constraints.

And then some tests:

<?php

namespace me\adamcameron\general\test\validation\collectionIssue;

use me\adamcameron\general\validation\collectionIssue\IntegerValidator;
use PHPUnit\Framework\TestCase;

class IntegerValidatorTest extends TestCase
{
    use ViolationAssertions;

    private $validator;

    protected function setUp()
    {
        $this->validator = new IntegerValidator();
    }

    /** @dataProvider provideCasesForValidateTests */
    public function testValidate($validValue)
    {
        $violations = $this->validator->validate($validValue);
        $this->assertHasNoViolations($violations);
    }

    public function provideCasesForValidateTests()
    {
        return [
            'has value' => ['value' => 42],
            'is null' => ['value' => null]
        ];
    }

    /** @dataProvider provideCasesForViolationsTests */
    public function testValidateReturnsViolationsWhenConstraintsBroken($invalidValue)
    {
        $violations = $this->validator->validate($invalidValue);
        $this->assertHasViolations($violations);
    }

    public function provideCasesForViolationsTests()
    {
        return [
            'string' => ['value' => 'forty-two'],
            'integer string' => ['value' => '42'],
            'float' => ['value' => 4.2],
            'integer as float' => ['value' => 42.0],
            'array' => ['value' => [42]]
        ];
    }
}

So I have two tests: valid values; invalid values. I expect no violations from the valid values, and some violations from the invalid values (I don't care what they are, just that there's a violation when the value ain't legit):

PHPUnit 7.2.6 by Sebastian Bergmann and contributors.

....... 7 / 7 (100%)

Time: 204 ms, Memory: 4.00MB

OK (7 tests, 14 assertions)

(Oh, I whipped a coupla custom assertions out into a trait):

namespace me\adamcameron\general\test\validation\collectionIssue;

use Symfony\Component\Validator\ConstraintViolationList;

trait ViolationAssertions
{
    public function assertHasNoViolations($violations)
    {
        $this->assertInstanceOf(ConstraintViolationList::class, $violations);
        $this->assertEmpty($violations);
    }

    public function assertHasViolations($violations)
    {
        $this->assertInstanceOf(ConstraintViolationList::class, $violations);
        $this->assertNotEmpty($violations);
    }
}


Bottom line, as you can see from the test results, one can pass an invalid value to a validator, and it won't error; it'll just return you yer violation. Cool.

So what's the go with a collection then? Why are we getting that error?

I concocted a similar CollectionValidator:

namespace me\adamcameron\general\validation\collectionIssue;

use Symfony\Component\Validator\Constraints\Collection;
use Symfony\Component\Validator\Constraints\Type;
use Symfony\Component\Validator\Validation;

class CollectionValidator
{

    private $validator;
    private $constraints;

    public function __construct()
    {
        $this->validator = Validation::createValidator();
        $this->setConstraints();
    }

    private function setConstraints()
    {
        $this->constraints = [
            new Collection([
                'fields' => [
                    'field1' => [new Type(['type'=>'integer'])]
                ],
                'allowMissingFields' => true,
                'allowExtraFields' => true
            ])
        ];
    }

    public function validate($value = null)
    {
        return $this->validator->validate($value, $this->constraints);
    }
}

This time I'm defining a "collection" (read: "array" in this case) which could have a property field1, and if it does; it needs to be an integer. It could have other properties too: don't care. Or indeed field1 could be missing: don't care. But if it's there... it needs to be an integer.

And we test this. I'll break the class down this time so I can discuss:

namespace me\adamcameron\general\test\validation\collectionIssue;

use me\adamcameron\general\validation\collectionIssue\CollectionValidator;
use PHPUnit\Framework\TestCase;

class CollectionValidatorTest extends TestCase
{
    use ViolationAssertions;

    private $validator;

    protected function setUp()
    {
        $this->validator = new CollectionValidator();
    }

    /** @dataProvider provideCasesForValidateEmptyCollectionTests */
    public function testValidateWithValidButEmptyCollectionTypeIsOk($validValue)
    {
        $violations = $this->validator->validate($validValue);
        $this->assertHasNoViolations($violations);
    }

    public function provideCasesForValidateEmptyCollectionTests()
    {
        return [
            'array' => ['value' => []],
            'iterator' => ['value' => new \ArrayIterator()]
        ];
    }
}


First just a baseline test with some empty (but valid) values. These pass:

PHPUnit 7.2.6 by Sebastian Bergmann and contributors.

.. 2 / 2 (100%)

Time: 187 ms, Memory: 4.00MB

OK (2 tests, 4 assertions)


Next I test a fully-legit collection with a valid integer value for the field1 value.

public function testValidateWithIntegerField1ValueShouldPass()
{
$violations = $this->validator->validate(['field1'=>42]);
$this->assertHasNoViolations($violations);
}

All good:

PHPUnit 7.2.6 by Sebastian Bergmann and contributors.

. 1 / 1 (100%)

Time: 211 ms, Memory: 4.00MB

OK (1 test, 2 assertions)

And now with the correct field1 property, but without an integer. This should have violations

public function testValidateWithNonIntegerField1ValueShouldHaveViolation()
{
$violations = $this->validator->validate(['field1'=>'NOT_AN_INTEGER']);
$this->assertHasViolations($violations);
$this->assertSame('This value should be of type integer.', $violations[0]->getMessage());
}

Results:

PHPUnit 7.2.6 by Sebastian Bergmann and contributors.

. 1 / 1 (100%)

Time: 191 ms, Memory: 4.00MB

OK (1 test, 3 assertions)


And now one without field1, but with another property:

public function testValidateWithOnlyOtherFieldsShouldPass()
{
$violations = $this->validator->validate(['anotherField'=>'another value']);
$this->assertHasNoViolations($violations);
}


PHPUnit 7.2.6 by Sebastian Bergmann and contributors.

. 1 / 1 (100%)

Time: 191 ms, Memory: 4.00MB

OK (1 test, 2 assertions)


And for good measure I chuck it a null and see what happens:

public function testValidateWithNullIsApparentlyOK()
{
$violations = $this->validator->validate(null);
$this->assertHasNoViolations($violations);
}


Not sure this is legit IMO - a null is not a collection as per my constraint definition. But... well... not what I care about right now. It passes:

PHPUnit 7.2.6 by Sebastian Bergmann and contributors.

. 1 / 1 (100%)

Time: 259 ms, Memory: 4.00MB

OK (1 test, 2 assertions)


OK so at this point I reckon I have done my constraints properly as my tests all pass.

And now the test that replicates our problem:

public function testValidateWithNonTraversableShouldCauseViolation()
{
$violations = $this->validator->validate('a string');
$this->assertHasViolations($violations);
}


Note that here I am not passing a collection value at all. I am just passing a string. So this should have a violation, right? Nuh-uh:

PHPUnit 7.2.6 by Sebastian Bergmann and contributors.

E 1 / 1 (100%)

Time: 194 ms, Memory: 4.00MB

There was 1 error:

1) me\adamcameron\general\test\validation\collectionIssue\CollectionValidatorTest::testValidateWithNonTraversableShouldCauseViolation
Symfony\Component\Validator\Exception\UnexpectedTypeException: Expected argument of type "array or Traversable and ArrayAccess", "string" given

C:\src\php\general\vendor\symfony\validator\Constraints\CollectionValidator.php:37
C:\src\php\general\vendor\symfony\validator\Validator\RecursiveContextualValidator.php:829
C:\src\php\general\vendor\symfony\validator\Validator\RecursiveContextualValidator.php:675
C:\src\php\general\vendor\symfony\validator\Validator\RecursiveContextualValidator.php:118
C:\src\php\general\vendor\symfony\validator\Validator\RecursiveValidator.php:100
C:\src\php\general\src\validation\collectionIssue\CollectionValidator.php:36
C:\src\php\general\test\validation\collectionIssue\CollectionValidatorTest.php:63

ERRORS!
Tests: 1, Assertions: 0, Errors: 1.


So let me get this straight. I am passing validation code a value that ought not validate. And what I get is not a "it dun't validate mate" violation, the validation lib just breaks.

Grumble.

Now this just seems to me to be a testing-101-fail on the part of the Validator library. One must always be able to give the validator any value, and the result should just be "yeah all good", or "nup. And here's why". However as the Symfony mob usually do such solid work, I am kinda wondering if I'm messing something up, and just not seeing it?

What do you think? Even if you don't know PHP or Symfony validation, what would your expectations be here? Or can you see the glaring error that I can't see from having looked at this too long?

I will presently ask a question on this on Stack Overflow (a heavily-abridged version of this), but wanted to get some more friendly eyes on it first.

Update

This was bugging my colleague so he went in and checked the underlying Symfony code - something I should have done - and the CollectionValidator is very specifically throwing this exception very early in the piece:

public function validate($value, Constraint $constraint)
{
    if (!$constraint instanceof Collection) {
        throw new UnexpectedTypeException($constraint, __NAMESPACE__.'\Collection');
    }
    if (null === $value) {
        return;
    }
    if (!is_array($value) && !($value instanceof \Traversable && $value instanceof \ArrayAccess)) {
        throw new UnexpectedTypeException($value, 'array or Traversable and ArrayAccess');
    }


That strikes me as very daft, and contrary to the intent of a validator lib. It's been there since the very first commit of that code, as far as I can tell (back in 2010). It's clearly their intent, but the intent is misguided IMO.
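
If one did need to defend against it in the meantime, the least-worst option I can think of is to catch the exception where we call the validator and treat it as a plain old bad request. This is just a sketch - SomeBadRequestException is a stand-in for whatever your app uses to signal a 400 - not what we actually have in our code:

use Symfony\Component\Validator\Exception\UnexpectedTypeException;

try {
    $violations = $this->collectionValidator->validate($requestBody['address']);
} catch (UnexpectedTypeException $e) {
    // the value wasn't even the right shape to be run through the constraints,
    // so report it back as a validation problem rather than letting it become a 500
    throw new SomeBadRequestException('address must be an object', 0, $e);
}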

Cheers.

--
Adam

I need to backtrack on some TDD guidance. I think

$
0
0
G'day:
Yes, this thing still exists.

Possibly foolishly my employer has made me tech lead of one of our software applications, and part of that is to lead from the front when it comes to maintaining code quality, including our TDD efforts. So I tend to pester the other devs about TDD techniques and the like. O how they love that. Some of them. Occasionally. Actually seldom. Poor bastards.

Anyway, recently I started to think about the difference between "fixing tests that would break if we made this new change to the code" and "writing a TDD-oriented test case before making the code change". Just to back up a bit... this is in the context of "write a failing test for the case being tested... then writing the code to pass the test". The red and green from the "red green refactor" trope. This started from doing code review and seeing existing tests being updated to accommodate new code changes, but not seeing any actual new test case explicitly testing the new requirement. This did not sit well with me, so I've been asking the devs "where's yer actual test case for the change we're making here?"

To help with this notion I wrote a quick article on WTF I was on about. I was initially pleased with it and thought I'd nailed it, but when I asked my more-knowledgeable colleague Brian Sadler to sanity check it, he wasn't convinced I had nailed it, and moreover wasn't sure I was even right. Doh. I've been mulling over this for the rest of the week and I've come to the conclusion... yeah, he's right and I'm wrong. I think. I'm writing this article to put my thinking down in public, and see what you lot think. If anyone still reads this I mean. I will be chasing some ppl to read it, that said...

I can't just reproduce the original article cos it was an internal document and work owns it. So I'm reproducing a similar thing here.

Here's some sample code. I stress the code used here bears no relation to any code or situation we have in our codebase. It's just a simplified case to contextualise the discussion:

class ThingRepository {

    public function getThing($id) {
        $dao = DaoFactory::getThingDao();
        $dbData = $dao->getThing($id);

        return Thing::createFromArray($dbData);
    }
}

It gets raw data for the requested thing from a DAO and returns the data as a Thing object.

It also has an existing test:


class ThingRepositoryTest extends TestCase {

    private $repository;

    protected function setUp() : void {
        $this->repository = new ThingRepository();
    }

    public function testGetThingReturnsCorrectThingForId(){
        $testId = 17;
        $expectedName = "Thing 1";
        $thing = $this->repository->getThing(17);

        $this->assertInstanceOf(Thing::class, $thing);
        $this->assertSame($testId, $thing->id);
        $this->assertSame($expectedName, $thing->name);
    }
}

And this all works fine:

C:\src\php\general>vendor\bin\phpunit test\tdd\fixingExistingTests\ThingRepositoryTest.php
PHPUnit 8.4.3 by Sebastian Bergmann and contributors.

.                                                                   1 / 1 (100%)

Time: 54 ms, Memory: 4.00 MB

OK (1 test, 3 assertions)

C:\src\php\general>

If you were paying attention, you'll've spotted we're tightly coupling the DAO implementation here to its repository. At least we're using a factory, but the repository is still tightly coupled to a concrete DAO implementation. We want to use dependency injection to provide the DAO to the repository.

Here's where my stance gets contentious.

Someone has addressed this requirement, and what they've done is updated that existing test to expect the passed-in DAO:

class ThingRepositoryExistingTestUpdatedTest extends TestCase {

    private $repository;
    private $dao;

    protected function setUp() : void {
        $this->dao = $this->createMock(ThingDao::class);
        $this->repository = new ThingRepositoryWithDaoDependency($this->dao);
    }

    public function testGetThingReturnsCorrectThingForId(){
        // implementation that now uses mocked DAO
    }
}

Note that this is exactly how I presented this information in the original work article. With the test implementation changes elided. This becomes relevant later.

Having prepped that test - and it fails nicely, cool - the dev then goes and sorts out the repo class.

I went on to reject that approach as not being valid TDD. My position is that "fixing other tests so they won't fail after we make the change" is not TDD, it's just technical debt recovery. If we're doing TDD, then we need to start with an explicit test case for the change we're going to make. I'll put this in a highlight box to draw attention to it:

Fixing other tests so they won't fail after we make the change is not TDD, it's just technical debt recovery. If we're doing TDD, then we need to start with an explicit test case for the change we're going to make.
This is what I'd expect to see:

class ThingRepositoryTest extends TestCase {

    private $repository;
    private $dao;

    protected function setUp() : void {
        $this->dao = $this->createMock(ThingDao::class);
        $this->repository = new ThingRepositoryWithDaoDependency($this->dao);
    }

    public function testGetThingUsesDaoPassedIntoConstructor(){
        $testId = 17;
        $mockedThingData = ['id'=>$testId, 'thing' => 'stuff'];

        $this->dao
            ->expects($this->once())
            ->method('getThing')
            ->with($testId)
            ->willReturn($mockedThingData);

        $result = $this->repository->getThing($testId);

        $this->assertEquals(
            new Thing($mockedThingData['id'], $mockedThingData['thing']),
            $result
        );
    }

    // ... rest of test cases that were already there, including this one...

    public function testGetThingReturnsCorrectThingForID(){
        // updated implementation that doesn't break because of the DAO change
    }
}

Here we have an explicit test case for the change we're making. This is TDD. We're explicitly injecting a DAO into the test repo, and we're checking that DAO is being used to get the data.

This test is red:

C:\src\php\general>vendor\bin\phpunit test\tdd\fixingExistingTests\ThingRepositoryTest.php
PHPUnit 8.4.3 by Sebastian Bergmann and contributors.

F                                                                   1 / 1 (100%)

Time: 61 ms, Memory: 4.00 MB

There was 1 failure:

1) me\adamcameron\general\test\tdd\fixingExistingTests\ThingRepositoryTest::testGetThingUsesDaoPassedIntoConstructor
Expectation failed for method name is "getThing" when invoked 1 time(s).
Method was expected to be called 1 times, actually called 0 times.

FAILURES!
Tests: 1, Assertions: 4, Failures: 1.

C:\src\php\general>

But that means we're now good to make the changes to the repository:

class ThingRepository {

    private $dao;

    public function __construct($dao) {
        $this->dao = $dao;
    }

    public function getThing($id){
        $dbData = $this->dao->getThing($id);

        return Thing::createFromArray($dbData);
    }
}

Now our test is green, and we're done.

I rounded out the article with this:

As a rule of thumb... if you don't start your TDD work by typing in a case name - eg testGetThingUsesDaoPassedIntoConstructor - then you're not doing TDD, you are just doing maintenance code. These are... not the same thing.

Brian read the article and - my words not his - didn't get the distinction I was trying to make. Why was the first test update not sufficient? I could not articulate myself any differently than repeat what I'd said - no, not helpful - so we decided to get together later and look at how the implementation of the test I had elided would pan out, and how that wouldn't itself be a decent test. Due to work being busy this week we never had that confab, but I continued to think about it, and this morning concluded that Brian was right, and I was mistaken.

It comes down to how I omitted the implementation of that first test we updated. I did not do this on purpose, I just didn't want to show an implementation that was not helpful to the article. But in doing that, I never thought about what the implementation would be, and set-up a straw man to support my argument. I went back this morning and implemented it...

public function testGetThingReturnsCorrectThingForId(){
    $testId = 17;
    $expectedName = "Thing 1";

    $this->dao->expects($this->once())
        ->method('getThing')
        ->with($testId)
        ->willReturn(['id' => $testId, 'name' => $expectedName]);

    $thing = $this->repository->getThing(17);

    $this->assertInstanceOf(Thing::class, $thing);
    $this->assertSame($testId, $thing->id);
    $this->assertSame($expectedName, $thing->name);
}


This is what I'd need to do to that first test to get it to not break when we change the DAO to being injected. And... erm... it's the same frickin' logic as in the test I said we need to explicitly make. I mean I wrote the two versions at different times so solved the same problem slightly differently, but it's the same thing for all intents and purposes.

So... let's revisit my original position then:

Fixing other tests so they won't fail after we make the change is not TDD, it's just technical debt recovery. If we're doing TDD, then we need to start with an explicit test case for the change we're going to make.

It's not intrinsically true. Actually fixing existing tests can indeed be precisely the TDD-oriented cover we need.

I love being wrong. No sarcasm there: I really do. Now I'm thinking better about this topic. Win.

The only thing left after the distillation is the difference between these two:

Original test name (detailing its original test case requirement):

testGetThingReturnsCorrectThingForId

And my suggested case name for the new injected-DAO requirement:

testGetThingUsesDaoPassedIntoConstructor

Now I do still think there's merit in being that explicit about what we're testing. But it's well into the territory of "perfect is the enemy of good". We're fine just updating that original test.

I thought of how to deal with this, and came up with this lunacy:

public function testGetThingUsesDaoPassedIntoConstructor()
{
$this->testGetThingReturnsCorrectThingForId();
}

But I think that's too pedantic even for me. I think we can generally just take it as given that the existing test covers us.



I'll close by saying I do think this is good advice when starting to do the testing for a code change to first come up with the test case name. It gets one into the correct mindset for approaching things in a TDD manner. But one oughtn't be as dogmatic about this as I clearly tend to be... existing tests quite possibly could provide the TDD-oriented cover we need for code changes. We do still need to start with a breaking test, and the test does actually need to cover the case for the change... but it might be an existing test.

Oh and making the code change then fixing the tests that break as a result is not TDD. That's still wrong ;-)

I'm dead keen to hear what anyone might have to say about all this...

Righto.

--
Adam

PHPUnit: trying to be clever with data providers and failing

$
0
0
G'day:
Yesterday we had a seemingly strange situation with a test one of my colleagues had written. It was behaving as if PHPUnit's exactly matcher was not working. The developer concerned was in the process of concluding that that is exactly what the problem is: PHPUnit. I suggested we pause and back-up a bit: given thousands of people use PHPUnit and the exactly matcher every day (including ourselves) and have not reported this issue, whereas the code we are running it with was freshly written and not used by anyone yet... almost certainly the problem lay with our code.

Before I get on with what the problem was, this bears repeating:

If you are doing something new and it's going wrong, the problem will almost certainly not be related to any established third-party tools you are using. It'll be down to how you're using them.

I find it's a really helpful mindset when troubleshooting problems to just own the issue from the outset, rather than looking around for something else to blame. And if you've looked and looked and can't see the problem with your own code, this probably just means you haven't spotted the problem yet. Absence of evidence is not evidence of absence.

Anyway, the issue at hand. Here's a pared-down repro of the situation. The code is meaningless as I've removed all business context from it, but we have this:

class MyService
{
    private $dependency;

    function __construct(MyDependency $dependency){
        $this->dependency = $dependency;
    }

    function myMethod() {
        return $this->dependency->someMethod();
    }
}


And MyDependency is this:

class MyDependency
{
    function someMethod(){
        return "ORIGINAL VALUE";
    }
}

myMethod is new, so the dev had written a test for this:

class MyServiceTest extends TestCase
{

    /** @dataProvider provideTestCasesForMyMethodTests */
    public function testMyMethodViaDataProvider($dependency)
    {
        $dependency
            ->expects($this->once())
            ->method('someMethod')
            ->willReturn("MOCKED VALUE");

        $service = new MyService($dependency);

        $result = $service->myMethod();

        $this->assertEquals("MOCKED VALUE", $result);
    }

    public function provideTestCasesForMyMethodTests(){
        return [
            'it should call someMethod once' => [
                'dependency' => $this->createMock(MyDependency::class)
            ]
        ];
    }
}

I hasten to add that it's completely daft to have the data provider in this example. However in the real-world example there were valid test variations, and given the method being tested was a factory method, part of the test was to check that the relevant method had been called on the relevant object being returned by the factory method. The data provider was legit.

This test passes:

PS C:\src\php\general> vendor\bin\phpunit .\test\phpunit\expectsWithDataProviderIssue\MyServiceTest.php
PHPUnit 8.5.0 by Sebastian Bergmann and contributors.

.                                                                   1 / 1 (100%)

Time: 77 ms, Memory: 4.00 MB

OK (1 test, 1 assertion)
PS C:\src\php\general>

However when we were trouble shooting something, we ended up with this assertion:

$dependency
->expects($this->exactly(10))
->method('someMethod')
->willReturn("MOCKED VALUE");


And the test still passed:

PS C:\src\php\general> vendor\bin\phpunit .\test\phpunit\expectsWithDataProviderIssue\MyServiceTest.php
PHPUnit 8.5.0 by Sebastian Bergmann and contributors.

.                                                                   1 / 1 (100%)

Time: 77 ms, Memory: 4.00 MB

OK (1 test, 1 assertion)
PS C:\src\php\general>

Whut the?

When we were looking at this I asked my colleague to simplify their tests a bit. The IRL version was testing this case via a data provider and test that was testing a bunch of things in a method that had some complexity issues, and I wanted to pare the situation back to just one simple test, testing this one thing. We ended up with this test:

public function testMyMethodViaInlineMock()
{
    $dependency = $this->createMock(MyDependency::class);
    $dependency
        ->expects($this->once())
        ->method('someMethod')
        ->willReturn("MOCKED VALUE");

    $service = new MyService($dependency);

    $result = $service->myMethod();

    $this->assertEquals("MOCKED VALUE", $result);
}


The difference here being we are not using a data provider any more, the mocking of the dependency is in the test. And this passes:

PS C:\src\php\general> vendor\bin\phpunit .\test\phpunit\expectsWithDataProviderIssue\MyServiceTest.php
PHPUnit 8.5.0 by Sebastian Bergmann and contributors.

.                                                                   1 / 1 (100%)

Time: 61 ms, Memory: 4.00 MB

OK (1 test, 2 assertions)
PS C:\src\php\general>                                                                                                                              

Good, fine, but now we tested out the ->expects($this->exactly(10)) issue again:

PS C:\src\php\general> vendor\bin\phpunit .\test\phpunit\expectsWithDataProviderIssue\MyServiceTest.php
PHPUnit 8.5.0 by Sebastian Bergmann and contributors.

F                                                                   1 / 1 (100%)

Time: 102 ms, Memory: 4.00 MB

There was 1 failure:

1) me\adamcameron\general\test\phpunit\expectsWithDataProviderIssue\MyServiceTest::testMyMethodViaInlineMock
Expectation failed for method name is "someMethod" when invoked 10 time(s).
Method was expected to be called 10 times, actually called 1 times.

FAILURES!
Tests: 1, Assertions: 2, Failures: 1.
PS C:\src\php\general>                                                                                                                                                    

This is more like it. But... erm... why?

I'd had "unexpected" behaviour previously when referring to $this inside a data provider method, because the data provider method is not called on the same instance of the test class that the tests are run with (why? Buggered if I know, but it is), and had my suspicions that this could be contributing to it. This is when we just shrugged and left the test as it was - working now - and I undertook to investigate some more later. The code I've been sharing here is the start of my investigation. I was at this point none the wiser though.

On a whim I tried another test variation, to see if there was any change in behaviour across the threshold of the count (so checking "off by one" situations). Here I have a new method that calls the same dependency method five times:
function myOtherMethod(){
    $this->dependency->someOtherMethod();
    $this->dependency->someOtherMethod();
    $this->dependency->someOtherMethod();
    $this->dependency->someOtherMethod();
    return $this->dependency->someOtherMethod();
}
And a test for it that just checks what happens when we expect it to be called 3-7 times:

/** @dataProvider provideTestCasesForMyOtherMethodTests */
public function testMyOtherMethodCallsDependencyMethodEnoughTimes($dependency, $expectedCount) {
    $dependency
        ->expects($this->exactly($expectedCount))
        ->method('someOtherMethod')
        ->willReturn("MOCKED VALUE");

    $service = new MyService($dependency);

    $result = $service->myOtherMethod();

    $this->assertEquals("MOCKED VALUE", $result);
}

public function provideTestCasesForMyOtherMethodTests(){
    return [
        'it expects myOtherMethod to be called 3 times' => [
            'dependency' => $this->createMock(MyDependency::class),
            'expectedCount' => 3
        ],
        'it expects myOtherMethod to be called 4 times' => [
            'dependency' => $this->createMock(MyDependency::class),
            'expectedCount' => 4
        ],
        'it expects myOtherMethod to be called 5 times' => [
            'dependency' => $this->createMock(MyDependency::class),
            'expectedCount' => 5
        ],
        'it expects myOtherMethod to be called 6 times' => [
            'dependency' => $this->createMock(MyDependency::class),
            'expectedCount' => 6
        ],
        'it expects myOtherMethod to be called 7 times' => [
            'dependency' => $this->createMock(MyDependency::class),
            'expectedCount' => 7
        ]
    ];
}


All things being equal, I was expecting no failures here, given what we'd seen before. But... erm...

PS C:\src\php\general> vendor\bin\phpunit .\test\phpunit\expectsWithDataProviderIssue\MyServiceTest.php --filter=testMyOtherMethodCallsDependencyMethodEnoughTimes
PHPUnit 8.5.0 by Sebastian Bergmann and contributors.

FF...                                                               5 / 5 (100%)

Time: 64 ms, Memory: 4.00 MB

There were 2 failures:

1) me\adamcameron\general\test\phpunit\expectsWithDataProviderIssue\MyServiceTest::testMyOtherMethodCallsDependencyMethodEnoughTimes with data set "it expects myOtherMethod to be called 3 times" (Mock_MyDependency_6a23ba62 Object (...), 3)
me\adamcameron\general\phpunit\expectsWithDataProviderIssue\MyDependency::someOtherMethod() was not expected to be called more than 3 times.

C:\src\php\general\src\phpunit\expectsWithDataProviderIssue\MyService.php:21
C:\src\php\general\test\phpunit\expectsWithDataProviderIssue\MyServiceTest.php:59

2) me\adamcameron\general\test\phpunit\expectsWithDataProviderIssue\MyServiceTest::testMyOtherMethodCallsDependencyMethodEnoughTimes with data set "it expects myOtherMethod to be called 4 times" (Mock_MyDependency_6a23ba62 Object (...), 4)
me\adamcameron\general\phpunit\expectsWithDataProviderIssue\MyDependency::someOtherMethod() was not expected to be called more than 4 times.

C:\src\php\general\src\phpunit\expectsWithDataProviderIssue\MyService.php:22
C:\src\php\general\test\phpunit\expectsWithDataProviderIssue\MyServiceTest.php:59

FAILURES!
Tests: 5, Assertions: 3, Failures: 2.
PS C:\src\php\general>                   


OK so this is a bit perplexing. When the expected call count is less than the actual call count we get it failing correctly. But when it's more... no failure. Now I'm flummoxed. For good measure I also have a test not creating the dependency in the data provider, but instead inline in the test:

/** @dataProvider provideTestCasesForMyOtherMethodTests */
public function testMyOtherMethodCallsDependencyMethodEnoughTimesWithDependencyCreatedInTest($_, $expectedCount) {
    $dependency = $this->createMock(MyDependency::class);
    $dependency
        ->expects($this->exactly($expectedCount))
        ->method('someOtherMethod')
        ->willReturn("MOCKED VALUE");

    $service = new MyService($dependency);

    $result = $service->myOtherMethod();

    $this->assertEquals("MOCKED VALUE", $result);
}


And this one behaves correctly:

PS C:\src\php\general> vendor\bin\phpunit .\test\phpunit\expectsWithDataProviderIssue\MyServiceTest.php --filter=testMyOtherMethodCallsDependencyMethodEnoughTimesWithDependencyCreatedInTest
PHPUnit 8.5.0 by Sebastian Bergmann and contributors.

FF.FF                                                               5 / 5 (100%)

Time: 67 ms, Memory: 4.00 MB

There were 4 failures:

1) me\adamcameron\general\test\phpunit\expectsWithDataProviderIssue\MyServiceTest::testMyOtherMethodCallsDependencyMethodEnoughTimesWithDependencyCreatedInTest with data set "it expects myOtherMethod to be called 3 times" (Mock_MyDependency_2707ab91 Object (...), 3)
me\adamcameron\general\phpunit\expectsWithDataProviderIssue\MyDependency::someOtherMethod() was not expected to be called more than 3 times.

C:\src\php\general\src\phpunit\expectsWithDataProviderIssue\MyService.php:21
C:\src\php\general\test\phpunit\expectsWithDataProviderIssue\MyServiceTest.php:99

2) me\adamcameron\general\test\phpunit\expectsWithDataProviderIssue\MyServiceTest::testMyOtherMethodCallsDependencyMethodEnoughTimesWithDependencyCreatedInTest with data set "it expects myOtherMethod to be called 4 times" (Mock_MyDependency_2707ab91 Object (...), 4)
me\adamcameron\general\phpunit\expectsWithDataProviderIssue\MyDependency::someOtherMethod() was not expected to be called more than 4 times.

C:\src\php\general\src\phpunit\expectsWithDataProviderIssue\MyService.php:22
C:\src\php\general\test\phpunit\expectsWithDataProviderIssue\MyServiceTest.php:99

3) me\adamcameron\general\test\phpunit\expectsWithDataProviderIssue\MyServiceTest::testMyOtherMethodCallsDependencyMethodEnoughTimesWithDependencyCreatedInTest with data set "it expects myOtherMethod to be called 6 times" (Mock_MyDependency_2707ab91 Object (...), 6)
Expectation failed for method name is "someOtherMethod" when invoked 6 time(s).
Method was expected to be called 6 times, actually called 5 times.

4) me\adamcameron\general\test\phpunit\expectsWithDataProviderIssue\MyServiceTest::testMyOtherMethodCallsDependencyMethodEnoughTimesWithDependencyCreatedInTest with data set "it expects myOtherMethod to be called 7 times" (Mock_MyDependency_2707ab91 Object (...), 7)
Expectation failed for method name is "someOtherMethod" when invoked 7 time(s).
Method was expected to be called 7 times, actually called 5 times.

FAILURES!
Tests: 5, Assertions: 6, Failures: 4.
PS C:\src\php\general>                                                                                                                                                                                                                                               

I started diving back through the PHPUnit code, and found the point where my two test variations began to demonstrate different behaviour. I edited InvocationHandler::verify to spit out some telemetry:

public function verify(): void
{
    printf(PHP_EOL . "InvocationHandler::verify called with %d matchers" . PHP_EOL, count($this->matchers));

    foreach ($this->matchers as $matcher) {
        $matcher->verify();
    }



And here's what I get running the two tests:

PS C:\src\php\general> vendor\bin\phpunit .\test\phpunit\expectsWithDataProviderIssue\MyServiceTest.php --filter=testMyMethodViaDataProvider
PHPUnit 8.5.0 by Sebastian Bergmann and contributors.

.                                                                   1 / 1 (100%)
InvocationHandler::verify called with 0 matchers


Time: 60 ms, Memory: 4.00 MB

OK (1 test, 1 assertion)
PS C:\src\php\general> vendor\bin\phpunit .\test\phpunit\expectsWithDataProviderIssue\MyServiceTest.php --filter=testMyMethodViaInlineMock
PHPUnit 8.5.0 by Sebastian Bergmann and contributors.

F                                                                   1 / 1 (100%)
InvocationHandler::verify called with 1 matchers


Time: 62 ms, Memory: 4.00 MB

There was 1 failure:

1) me\adamcameron\general\test\phpunit\expectsWithDataProviderIssue\MyServiceTest::testMyMethodViaInlineMock
Expectation failed for method name is "someMethod" when invoked 10 time(s).
Method was expected to be called 10 times, actually called 1 times.

FAILURES!
Tests: 1, Assertions: 2, Failures: 1.
PS C:\src\php\general>                             

Well. So when we run our tests using the mock created in the test code... we get our matcher "loaded" (for lack of a better term); when we run our test using the mock created in the data provider: it's nowhere to be seen. Interesting.

Looking further up the call stack, we find that this verify method is called from TestCase::verifyMockObects:
private function verifyMockObjects(): void
{
    foreach ($this->mockObjects as $mockObject) {
        if ($mockObject->__phpunit_hasMatchers()) {
            $this->numAssertions++;
        }

        $mockObject->__phpunit_verify(
            $this->shouldInvocationMockerBeReset($mockObject)
        );
    }


(I dunno what's going on with the method name there, but I did verify the call stack... PHPUnit is doing some special magic there).

verifyMockObjects in turn is called from runBare in the same class, and that is run from TestResult::run.

The key thing here is verifyMockObjects being run in the context of the TestCase object, and it looks at $this->mockObjects. Where does $this->mockObjects come from? When we create the mock:

public function getMock(): MockObject
{
    $object = $this->generator->getMock(
        $this->type,
        !$this->emptyMethodsArray ? $this->methods : null,
        $this->constructorArgs,
        $this->mockClassName,
        $this->originalConstructor,
        $this->originalClone,
        $this->autoload,
        $this->cloneArguments,
        $this->callOriginalMethods,
        $this->proxyTarget,
        $this->allowMockingUnknownTypes,
        $this->returnValueGeneration
    );

    $this->testCase->registerMockObject($object);

    return $object;
}

// ...

public function registerMockObject(MockObject $mockObject): void
{
    $this->mockObjects[] = $mockObject;
}

(remember that our createMock call is just a wrapper for a getMockBuilder()...->getMock() call).

So what's the bottom line here? Well my initial wariness about creating the mock in the data provider method has legs... when a mock is created it's registered in the TestCase object it's created within... and the one that the data provider method is running in is not the same TestCase as the one the test is run in. So... none of the mock's constraints are checked when the test is run. That's mostly it.

But what's the story with how that last test still failed on the variants with only 3/5 and 4/5 calls made? That check is done when the mocked method is actually called. So given this:

function myMethod() {
return $this->dependency->someMethod();
}

When the mocked someMethod is called, a check is made then as to whether it's been called too many times for the expectation. So that check is done within the mock itself at the time it's executed, hence it still "works". It's just the check that the exact number of call times was hit that is done at test level, not mock level, if that makes sense.

This all just goes to show that one should just use data provider methods for providing data values for test variations, not to treat them like some sort of looping mechanism that the test is just a callback for. That's not the intent, and when one tries to be tricky... one is quickly found out to not be as clever as one hopes... ;-)
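
To make that concrete, here's roughly what the first test looks like with the provider doing nothing but providing values, and the mock created where it belongs: in the test. A sketch only, reusing the example classes from above (the provider method name is made up):

use PHPUnit\Framework\TestCase;

class MyServiceTest extends TestCase
{
    /** @dataProvider provideReturnValuesForMyMethodTests */
    public function testMyMethodViaDataProvider($returnValue)
    {
        // the mock is created on the same TestCase instance the test runs on,
        // so its expectations actually get verified when the test finishes
        $dependency = $this->createMock(MyDependency::class);
        $dependency
            ->expects($this->once())
            ->method('someMethod')
            ->willReturn($returnValue);

        $service = new MyService($dependency);

        $this->assertEquals($returnValue, $service->myMethod());
    }

    public function provideReturnValuesForMyMethodTests()
    {
        // plain values only: nothing here needs to be wired into the test run
        return [
            'it should call someMethod once' => ['returnValue' => 'MOCKED VALUE']
        ];
    }
}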

That's an awful lot of carry on for what ended up to be an easy / simple explanation. I'm mostly writing this up in case anyone else gets the same issue, and also to show how I went about troubleshooting and investigating it. I'll be putting this in front of my team shortly...

Righto.

--
Adam

Brian

$
0
0
G'day:
So this really sux.

For the bulk of the last decade I've been working alongside Brian Sadler. We first worked with each other when he joined hostelbookers.com when I'd been there for a coupla years, back in fuck-knows-when. I can recall thinking "shit... someone as old as I am, this should be interesting", in a room of devs (and managers) who were between 5-15 years younger than the both of us. Not that chronological experience necessarily means anything in any way that talent can be measured, but I had been used to being the oldest and most experienced geezer in the teams I'd been involved in. I've always been old, basically. So this new "Brian" guy seemed an interesting colleague to acquire.

My instinct was right.

[I've typed-in a paragraph of over-written shite four times now, and deleted the lot. I need to get to the point]

I have been a reasonably good programmer, technically. I'm OK with saying that.

What I've learned from Brian is that is only a small part of being a good team operative. I've always known and respected that programming is an act of communicating with humans, not the computer, but Brian really drove this home to me.

He introduced me to the concept of Clean Code.

He introduced me to the concept of TDD.

He schooled me in various concepts of refactoring, in that latter stage of Red Green Refactor. I'm still reading Martin Fowler's "Refactoring" as a result.

Every time I am looking at a code problem and I know I am not quite getting it, I hit him up and say "right, come on then... what am I missing...?" and he'll pull out a design pattern, or some OOP concept I'd forgotten about, or just has the ability to "see" what I can't in a code conundrum, and explain it to me.

Every time I ask "how can we make this code easier for our team to work with in future?", Brian's had the answer.

For a few years Brian's has been our Agile advocate, and listening to him and following his guidance has been an immeasurable benefit to my capabilities and my work focus.

I've also seen Brian help everyone else in my immediate team; other teams; and our leaders.

He's probably the most significant force for good I have had in my career.

He's also a really fuckin' good mate, for a lot of reasons that are not relevant for this sort of environment.

For reasons that are also irrelevant to this blog, today was the last day for the time being that Brian and I will be working with each other, and I am genuinely sad about that.

Genuinely sad.

Thanks bro.

--
Adam

Aaaaah... *Me*

$
0
0
G'day:
OK to continue this occasional series of commenting on ppl who have impacted my tenure at work, as they leave. Time to do another one.

Adam Cameron.

What a prick. Always nagging about doing TDD, always being pedantic about code reviews. Banging on about clean code. Asking for stuff to be refactored and simplified. Being stroppy. Then being even more stroppy.

So it's a bloody relief that after 10 years he's finally leaving, and getting out of everyone's hair.

Cameron is reluctantly leaving behind a whole bunch of people who have helped him grow as a professional (one way or another), enriched his work experience and who he genuinely admires. Both professionally and personally. He'll probably stay in touch with a bunch of them. And knowing him he'll be showing up in Porto to have beers with his squad as soon as he can.

He's been in a coupla good squads in his time, but this one that he's been leading for the last year is the best. He was lucky to get that lot as his first team as "Tech Lead". I doubt he'd've turned out reasonably OK at that role if it wasn't for them. The good thing is they hardly need leading, so will continue to do their excellent work even without him around.

10 years. Fuck me.

See ya.

--
Adam

Ben Brumm writes an interesting article on Hierarchical Data in SQL

$
0
0

G'day:

This is solely a heads-up regarding some potential reading for you.

A bloke called Ben Brumm came across one of my old articles, "CFML: an exercise in converting a nested set tree into an adjacency list tree". He's hit me up and asked me to add a link to that to an article he's written "Hierarchical Data in SQL: The Ultimate Guide". I've done that, but that article is buried in the mists of time back in 2015. I figured he could benefit from a more contemporary nudge too. So here it is. Me nudging.

In other news people have been urging me to get back to this blog, but I'm afraid I still am not finding anything new to write about. So... erm... yeah. If something leaps out at me I do fully intend to come back to this, but... not for now.

Righto.

-- 

Adam


ColdFusion: looking at an issue Mingo had with ehcache and cachePut and cacheGet

$
0
0

 G'day:

Bloody ColdFusion. OK so why am I writing about a ColdFusion issue? Well about 80% of it is "not having a great deal else to do today", about 10% is being interested in this issue Mingo found, and 10% is it being an excuse to mess around with Docker a bit. I am currently - and slowly - teaching myself about Docker, so there's some practice for me in getting a ColdFusion instance up and running on this PC (which I no longer have any type of CFML engine installed on).

OK so what's the issue? Mingo posted this on Twitter:


Just in case you wanna run that code, here it is for copy and paste:
foo = { bar = 1 };
cachePut( 'foobar', foo );

foo.bar = 2;

writeDump( cacheGet( 'foobar' ) );


Obviously (?) what one would expect here is 1. What gets put into cache would be a copy of the struct right?

Well... um... here goes...

/opt/coldfusion/cfusion/bin $ ./cf.sh
ColdFusion started in interactive mode. Type 'q' to quit.
cf-cli>foo = { bar = 1 };
struct
BAR: 1

cf-cli>cachePut( 'foobar', foo );
cf-cli>foo.bar = 2;
2
cf-cli>writeDump( cacheGet( 'foobar' ) );
struct
BAR: 2

cf-cli>

... errr... what?

It looks like ColdFusion is like only putting a reference to the struct into cache. So any code changing the data in the struct is changing it in CFML as well as changing it in cache. This does not seem right.

I did a quick google and found a ticket in Adobe's system about this: CF-3989480 - cacheGet returns values by reference. The important bit to note is that it's closed with


Not a great explanation from Adobe there, fortunately Rob Brooks-Bilson had commented further up with a link to a cached version of an old blog article of his, explaining what's going on: "Issue with Ehcache and ColdFusion Query Objects". It's a slightly different situation, but it's the same underlying "issue". Just to copy and paste the relevant bit from his article:

Update: It looks like this is actually expected behavior in Ehcache. Unfortunately, it's not documented in the ColdFusion documentation anywhere, but Ehcache actually has two configurable parameters (as of v. 2.10) called copyOnRead and copyOnWrite that determine whether values returned from the cache are by reference or copies of the original values. By default, items are returned by reference. Unfortunately we can't take advantage of these parameters right now as CF 9.0.1 implements Ehcache 2.0.


I decided to have a look what we could do about this on ColdFusion 2018, hoping that its embedded implementation of Ehcache has been updated since Rob wrote that in 2010.

Firstly I checked the Ehcache docs for these two settings: copyOnWrite and copyOnRead. This is straight forward (from "copyOnRead and copyOnWrite cache configuration"):


<cache name="copyCache"
maxEntriesLocalHeap="10"
eternal="false"
timeToIdleSeconds="5"
timeToLiveSeconds="10"
copyOnRead="true"
copyOnWrite="true">
<persistence strategy="none"/>
<copyStrategy class="com.company.ehcache.MyCopyStrategy"/>
</cache>


The docs also confirm these are off by default.

Next where's the file?

/opt/coldfusion $ find . -name ehcache.xml
./cfusion/lib/ehcache.xml
/opt/coldfusion $

Cool. BTW I just guessed at that file name.

So in there we have this (way down at line 471):


<!--
Mandatory Default Cache configuration. These settings will be applied to caches
created programmtically using CacheManager.add(String cacheName).

The defaultCache has an implicit name "default" which is a reserved cache name.
-->
<defaultCache
maxElementsInMemory="10000"
eternal="false"
timeToIdleSeconds="86400"
timeToLiveSeconds="86400"
overflowToDisk="false"
diskSpoolBufferSizeMB="30"
maxElementsOnDisk="10000000"
diskPersistent="false"
diskExpiryThreadIntervalSeconds="3600"
memoryStoreEvictionPolicy="LRU"
clearOnFlush="true"
statistics="true"
/>


That looked promising, so I updated it to use copyOnWrite:



<defaultCache
maxElementsInMemory="10000"
eternal="false"
timeToIdleSeconds="86400"
timeToLiveSeconds="86400"
overflowToDisk="false"
diskSpoolBufferSizeMB="30"
maxElementsOnDisk="10000000"
diskPersistent="false"
diskExpiryThreadIntervalSeconds="3600"
memoryStoreEvictionPolicy="LRU"
clearOnFlush="true"
statistics="true"
copyOnWrite="false"
/>


Whatever I put into cache, I want it to be decoupled from the code immediately, hence doing the copy-on-write.

I restarted CF and ran the code again:

/opt/coldfusion/cfusion/bin $ ./cf.sh
ColdFusion started in interactive mode. Type 'q' to quit.
cf-cli>foo = { bar = 1 };
struct

BAR: 1

cf-cli>cachePut( 'foobar', foo );
cf-cli>foo.bar = 2;
2

cf-cli>writeDump( cacheGet( 'foobar' ) );

struct
BAR: 1

cf-cli>


Yay! We are getting the "expected" result now: 1

Don't really have much else to say about this. I'm mulling over writing down what I did to get ColdFusion 2018 up and running via Docker instead of installing it. Let's see if I can be arsed...

Righto.

-- 

Adam

Quickly running ColdFusion via Docker

$
0
0

G'day:
Firstly, don't take this as some sort of informed tutorial as to how things should be done. I am completely new to Docker, and don't yet know my arse from my elbow with it. Today I decided to look at an issue my mate was having with ColdFusion (see "ColdFusion: looking at an issue Mingo had with ehcache and cachePut and cacheGet"). These days I do not have CF installed anywhere, so to look into this issue, I needed to run it somehow. I was not gonna do an actual full install just for this. Also I've been trying - slowly -  to get up to speed with Docker, and this seemed to be a sitter for doing something newish with it. Also note that this is not the sort of issue I could just use trycf.com to run some code, as it required looking at some server config.

All I'm gonna do here is document what I had to do today to solve the problem in that other article.

I am running on Windows 10 Pro, I have WSL installed, and I'm running an Ubuntu distribution, and using bash within that. I could use powershell, but I so seldom use it, I'm actually slightly more comfortable with bash. For general Windows command-line stuff I usually just use the default shell, and it didn't even occur to me until just now to check if the docker CLI actually works in that shell, given all the docs I have read use either Powershell or some *nix shell. I just checked and seems it does work in the default shell too. Ha. Ah well, I'm gonna stick with bash for this.

I've also already got Docker Desktop & CLI installed and running. I've worked through the Getting Started material on the Docker website, and that's really about it, other than a coupla additional experiments.

Right. I'd been made aware that Adobe maintain some Docker images for ColdFusion, but no idea where they are or that sort of jazz, so just googled it ("docker coldfusion image"). The first match takes me to "Docker images for ColdFusion", and on there are the exact Docker commands to run CF / code in CF in various ways. What I wanted to do was to run the ColdFusion REPL, and mess around with its config files. The closest option to what I wanted was to run ColdFusion's built-in web server to serve a website. I figured it'd use an image with a complete ColdFusion install, so whilst I didn't want or need the web part of it, I could SSH into the container and use the REPL that way. So the option for me was this one:

Run ColdFusion image in daemon mode (that link is just a deep link to the previous one above). That gives the command:

docker run -dt -p 8500:8500 -v c:/wwwroot/:/app \
-e acceptEULA=YES -e password=ColdFusion123 -e enableSecureProfile=true \
eaps-docker-coldfusion.bintray.io/cf/coldfusion:latest

(I've just split that over multiple lines for readability).

Here we're gonna be running the container detached (-d), so once the container runs, control returns to my shell. I'm not at all clear why they're using the pseudo-TTY option here (-t / --tty). From all my reading (basically this Q&A on Stack Overflow: "Confused about Docker -t option to Allocate a pseudo-TTY"), it's more for when not in detached mode?

We're also exposing (publishing) the container's port 8500 to the host. IE: when I make HTTP requests to port 8500 in my PC's browser, it will be routed to the container (and CF's inbuilt web server in the container will be listening).

Lastly on that line we're mounting C:\wwwroot as a volume (-v) in the container as /app. IE: files in my PC's C:\wwwroot dir will be availed within the container in the /app dir.

The second line is just a bunch of environment variables (-e) that get set in the container... the ColdFusion installation process is expecting these, I guess.

And the last line is the address of where the CF image is.

I'm not going to use exactly this command actually. I'm on Ubuntu so I don't have a C:\wwwroot, and I don't even care what's in the web root for this exercise (I'm just wanting the REPL, remember), so I'm just going to point it to a temp directory in my home dir (~/tmp). Also I see no reason to enable the secure profile. That's just a pain in the arse. Although it would probably not matter one way or the other.

Also I want to give my container a name (--name) for easy reference.

So here's what I've got:

docker run --name cf -dt -p 8500:8500 -v ~/tmp:/app \
-e acceptEULA=YES -e password=ColdFusion123 -e enableSecureProfile=false \
eaps-docker-coldfusion.bintray.io/cf/coldfusion:latest

Let's run that:

adam@DESKTOP-QV1A45U:~$ docker run --name cf -dt -p 8500:8500 -v ~/tmp:/app \
> -e acceptEULA=YES -e password=ColdFusion123 -e enableSecureProfile=false \
> eaps-docker-coldfusion.bintray.io/cf/coldfusion:latest
Unable to find image 'eaps-docker-coldfusion.bintray.io/cf/coldfusion:latest' locally
latest: Pulling from cf/coldfusion
b234f539f7a1: Pull complete
55172d420b43: Pull complete
5ba5bbeb6b91: Pull complete
43ae2841ad7a: Pull complete
f6c9c6de4190: Pull complete
b3121a6492e2: Pull complete
cf911c0e9bab: Pull complete
ef2cca3cfd53: Pull complete
f7a96a442bce: Pull complete
5f2632eaddf0: Pull complete
f4563a5d4acb: Pull complete
ad1e188590ee: Pull complete
Digest: sha256:12e7dc19bb642a0f67dba9a6d0a2e93f805de0a968c6f92fdcb7c9c418e1f1e8
Status: Downloaded newer image for eaps-docker-coldfusion.bintray.io/cf/coldfusion:latest
c7b703635a5358f0ea91c3f2549d8ec4a308812c44b2476bd6ad4c00b2e32c12
adam@DESKTOP-QV1A45U:~$

This takes a few minutes to run cos the ColdFusion image is 570MB and it also downloads a bunch of other stuff. As it goes, you see progress like this:

...
ef2cca3cfd53: Pull complete
f7a96a442bce: Downloading [=================================================> ]  560.9MB/570.4MB
f7a96a442bce: Pull complete
...

In theory ColdFusion should now be installed and listening on http://localhost:8500, and...


Cool. I've had some issues running the REPL before if the CFAdmin installation step hasn't been run at least once, so I let that go (the password is the one from the environment variable, yeah? ColdFusion123).

I can see the image and the container in docker now too:

adam@DESKTOP-QV1A45U:~$
adam@DESKTOP-QV1A45U:~$ docker image ls
REPOSITORY                                        TAG                 IMAGE ID            CREATED             SIZE
eaps-docker-coldfusion.bintray.io/cf/coldfusion   latest              31a6d984cf30        2 months ago        1.03GB
adam@DESKTOP-QV1A45U:~$
adam@DESKTOP-QV1A45U:~$
adam@DESKTOP-QV1A45U:~$ docker container ls
CONTAINER ID        IMAGE                                                    COMMAND                  CREATED             STATUS                    PORTS                                         NAMES
c7b703635a53        eaps-docker-coldfusion.bintray.io/cf/coldfusion:latest   "sh /opt/startup/sta…"   10 minutes ago      Up 10 minutes (healthy)   8016/tcp, 45564/tcp, 0.0.0.0:8500->8500/tcp   cf
adam@DESKTOP-QV1A45U:~$
adam@DESKTOP-QV1A45U:~$

Next we need to SSH into the container. For that I'm using some geezer called Jeroen Peters' docker-ssh container. I found this from a bunch of googling, and people saying not to run an SSH server inside your application containers, and instead to have the SSH server in a container of its own. Seems legit.
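
In hindsight, docker exec would quite possibly have got me a shell in the container without any SSH palaver at all - something like this:

docker exec --interactive --tty cf /bin/bash

But the SSH-server-in-its-own-container approach is what my googling turned up, so that's what I'm running with here.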

I've modified his docker command slightly to a) point to my cf container, and b) detach once it's started:

adam@DESKTOP-QV1A45U:~$
adam@DESKTOP-QV1A45U:~$ docker run --detach -e FILTERS={\"name\":[\"^/cf$\"]} -e AUTH_MECHANISM=noAuth \
--name sshd-web-server1 -p 2222:22  --rm \
-v /var/run/docker.sock:/var/run/docker.sock \
jeroenpeeters/docker-ssh
5447950ce4d4d7ced8ef61696a64b8af9c4c01139c4fde58fa558ef25ba89b58
adam@DESKTOP-QV1A45U:~$
adam@DESKTOP-QV1A45U:~$

I can now SSH into the container:

adam@DESKTOP-QV1A45U:~$ ssh localhost -p 2222

 ###############################################################
 ## Docker SSH ~ Because every container should be accessible ##
 ############################################################### 
 ## container | /cf                                           ##
 ###############################################################

/opt $

I'm in! Cool!

Just to prove to myself further that I am actually "inside" the container, I jumped over to another bash session on my PC, and stuck a new file in that ~/tmp directory:

adam@DESKTOP-QV1A45U:~$ cd tmp
adam@DESKTOP-QV1A45U:~/tmp$ cat > new_file
G'day world
^Z
[2]+  Stopped                 cat > new_file
adam@DESKTOP-QV1A45U:~/tmp$ ll
total 20
drwxr-xr-x 3 adam adam 4096 Oct 23 19:10 ./
drwxr-xr-x 6 adam adam 4096 Oct 23 19:09 ../
drwxr-xr-x 4  999  999 4096 Oct 23 18:25 WEB-INF/
-rwxr-xr-x 1  999 root  370 Oct 23 18:24 crossdomain.xml*
-rw-r--r-- 1 adam adam   12 Oct 23 19:10 new_file
adam@DESKTOP-QV1A45U:~/tmp$
adam@DESKTOP-QV1A45U:~/tmp$ cat new_file
G'day world
adam@DESKTOP-QV1A45U:~/tmp$

And checked it from within the container:

/opt $ cd /app
/app $ # before

/app $ ll
total 16
drwxr-xr-x 3   1000   1000 4096 Oct 23 18:09 ./
drwxr-xr-x 1 root   root   4096 Oct 23 17:24 ../
drwxr-xr-x 4 cfuser cfuser 4096 Oct 23 17:25 WEB-INF/
-rwxr-xr-x 1 cfuser root    370 Oct 23 17:24 crossdomain.xml*
/app $
/app $ # after
/app $ ll
total 20
drwxr-xr-x 3   1000   1000 4096 Oct 23 18:10 ./
drwxr-xr-x 1 root   root   4096 Oct 23 17:24 ../
drwxr-xr-x 4 cfuser cfuser 4096 Oct 23 17:25 WEB-INF/
-rwxr-xr-x 1 cfuser root    370 Oct 23 17:24 crossdomain.xml*
-rw-r--r-- 1   1000   1000   12 Oct 23 18:10 new_file
/app $
/app $ cat new_file

G'day world

/app $

OK so I think I'm happy with that. So... now to run the code...

/app $ cd /opt/coldfusion/cfusion/bin/
/opt/coldfusion/cfusion/bin $ ./cf.sh
ColdFusion started in interactive mode. Type 'q' to quit.
cf-cli>foo = { bar = 1 };
cachePut( 'foobar', foo );

foo.bar = 2;

writeDump( cacheGet( 'foobar' ) );
struct

BAR: 1

cf-cli>
cf-cli>cf-cli>2
cf-cli>cf-cli>struct

BAR: 2

cf-cli>^Z
[1]+  Stopped                 ./cf.sh
/opt/coldfusion/cfusion/bin $

All this is discussed in the previous article (ColdFusion: looking at an issue Mingo had with ehcache and cachePut and cacheGet). This is showing the problem though... the answer should be 1 not 2

I have to fix this cache setting...

/opt/coldfusion/cfusion/bin $ cd ../lib
/opt/coldfusion/cfusion/lib $ ll ehcache.xml
-rwxr-xr-x 1 cfuser bin 25056 Jun 25  2018 ehcache.xml*
/opt/coldfusion/cfusion/lib $ vi ehcache.xml
bash: vi: command not found
/opt/coldfusion/cfusion/lib $

Erm... wat??

Wow OK so this is a really slimmed down container (except the 100s of MB of ColdFusion stuff I mean). I can fix this...

/opt/coldfusion/cfusion/lib $ sudo apt-get update
bash: sudo: command not found
/opt/coldfusion/cfusion/lib $

WAAAAAH. OK this is getting annoying now. On a whim I just tried without sudo

/opt/coldfusion/cfusion/lib $ apt-get update
Get:1 http://archive.ubuntu.com/ubuntu xenial InRelease [247 kB]
[etc]
Fetched 29.4 MB in 10s (2758 kB/s)
Reading package lists... Done
/opt/coldfusion/cfusion/lib $
/opt/coldfusion/cfusion/lib $ apt-get install vi
Reading package lists... Done
Building dependency tree
Reading state information... Done
E: Unable to locate package vi
/opt/coldfusion/cfusion/lib $

Fine. One step forward, one step sideways. Grumble. Another whim:

/opt/coldfusion/cfusion/lib $ apt-get install vim
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
[heaps of stuff]
/opt/coldfusion/cfusion/lib $

Ooh! Progress. But... does it work? (hard to get proof of this as vi takes over the shell, but... yes it worked).
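
For the record, the edit itself is the same one as in the previous article: adding the copy-on-write flag to the defaultCache element in ehcache.xml, ie:

copyOnWrite="true"

Nothing more exciting than that; it just needed doing with vi inside the container this time.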

I now just have to restart the CF container, and test the fix:

/opt/coldfusion/cfusion/lib $ exit
exit
There are stopped jobs.
/opt/coldfusion/cfusion/lib $ exit
exit
Connection to localhost closed.
adam@DESKTOP-QV1A45U:~$ docker stop cf
cf
adam@DESKTOP-QV1A45U:~$ docker start cf
cf
adam@DESKTOP-QV1A45U:~$ ssh localhost -p 2222
 ###############################################################
 ## Docker SSH ~ Because every container should be accessible ##
 ###############################################################
 ## container | /cf                                           ##
 ###############################################################

/opt $ cd coldfusion/cfusion/bin/
/opt/coldfusion/cfusion/bin $ ./cf.sh
ColdFusion started in interactive mode. Type 'q' to quit.
cf-cli>foo = { bar = 1 };
cachePut( 'foobar', foo );

foo.bar = 2;

writeDump( cacheGet( 'foobar' ) );
struct

BAR: 1


cf-cli>
cf-cli>cf-cli>2
cf-cli>cf-cli>struct

BAR: 1

cf-cli>

WOOHOO! I managed to do something useful with Docker (and then reproduce it all again when writing this article). Am quite happy about that.

Final thing... if yer reading this and are going "frickin' hell Cameron, you shouldn't do [whatever] like [how I did it]", please let me know. This is very much a learning exercise for me.

Anyhow... this is the most work I've done in about four months now, and I need a... well... another beer.

Righto.

--
Adam

Getting Docker, Clojure and VSCode all playing nice (well... almost...)


G'day:
I'm trying to re-motivate myself to do some tech type stuff in my spare time. I have a lot of spare time at the moment. 24h of the day. I still haven't really looked for a job having been made redundant a few months back (coughJunecough), and I've been largely wasting my time sleeping, napping (yes, two distinct things), and playing computer games. On one hand 2020 is a bit of a rubbish year, and a new work environment is likely to be a subpar experience, even if I can find work. On the other hand I can afford to not work for quite a while thanks to my redundancy pay out, but on still another hand (yes, fine, I have three hands for the purposes of this story), I really am not taking advantage of this spare time I have. That's why I had a look at Mingo's CF problem the other day ("looking at an issue Mingo had with ehcache and cachePut and cacheGet"). Just trying to do anything marginally "productive".

I've been wanting to learn a bit of Clojure for a while now. Not for any particular reason, and I'm not looking to shift to doing that over PHP, but just cos it's a completely different language from anything I've used before - which have all been C-ish "curly brace" type things - and my mate Sean has banged on about how good it is. I think at his suggestion I bought Living Clojure by Carin Meier a while back when I had some down time in NZ. I read the first few pages and then realised Auckland has a lot of pubs, so did that instead. This would have been about four(?) years ago. I ain't touched it since.

Here I am, sporadically trying to get to grips with Docker, and this seemed like a good mini-project... get Clojure up and running within a container, set up a project, integrate it with an IDE, and get to the point where I have a running (and passing) test for some sort of "G'day world" function. In the past I would install whatever language I was wanting to mess with natively on my PC, but I decided running it in a container is the way to go.

Note that this is pretty much just a record of what I needed to do to get to the point I could write the code and run the tests. I have very little experience with Docker, and hardly any experience with Clojure. What this means is that this is def a newbie account of what I did, and I really would not take it as instructions of how to do stuff. All I will say is that I needed to do a whole bunch of googling to work shit out, and I will summarise the results of the googling here. This is more an exercise for me to verify what I've done actually works, as having done it once... I'm now getting rid of it all and starting from scratch.

OK. I am starting with a minimal "empty" repo, living-clojure:

adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$ tree -aF --dirsfirst -L 1 .
.
├── .git/
├── .gitignore*
├── LICENSE*
└── README.md*

1 directory, 3 files
adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$

This just gives me a directory I can mount in the container to expose my code in there.

Now I need the container. Looking at dockerhub/clojure, the options there are just for running the app, not doing development, so ain't much help. I was hoping to be able to copy and paste something like "a container with Clojure and Leiningen in it wot you can use as a shell for building yer project, testing it, developing it, and running it". Nothing to copy and paste and run, so I'm gonna have to work out what to do. Remember I'm a complete noob with Docker.

Getting the image down was easy enough:

adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$ docker pull clojure
Using default tag: latest
latest: Pulling from library/clojure
852e50cd189d: Already exists
ef17c1a94464: Pull complete
477589359411: Pull complete
0a40767d8190: Downloading [==================> ] 71.81MB/197.1MB
434967525bea: Download complete
c89e9081a4db: Download complete
b7cd84bcb910: Download complete
214999e8c5eb: Download complete
0bdd8adb02ca: Download complete

(Just showing it mid-pull there)

adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$ docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
php rc-cli 657d0925a963 6 days ago 407MB
clojure latest 846a444b258f 6 days ago 586MB
[etc]

Just to start with, I just need a container based on the Clojure image, but doing nothing else. I just need to run a shell with it so I can create my Clojure project. I came up with this:

adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$ docker create --name living-clojure --mount type=bind,source=/mnt/c/src/living-clojure,target=/usr/src/living-clojure --workdir /usr/src/living-clojure --interactive --tty clojure /bin/bash
a1be067ee435adb6851ac06b53bb97545f38dabced73a19255f1e273b0337d1e
adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$
  • I'm mounting my host's project directory within the container;
  • and setting that to be my working dir in there too;
  • I don't completely get these two, but I wanna be able to type shit into my shell that I'm running, and this seems to make it happen;
  • and it's bash I want to interact with.

And I start this:

adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$ docker start --interactive living-clojure
root@a1be067ee435:/usr/src/living-clojure#

Hurrah! This seems promising: I'm in a different shell, and I'm in the working directory I specified when doing the docker create before.

Let's see if I can create a new Clojure project (which I'm lifting from the Leiningen docs):

root@a1be067ee435:/usr/src/living-clojure# cd ..
root@a1be067ee435:/usr/src# lein new app living-clojure --force
Generating a project called living-clojure based on the 'app' template.
root@a1be067ee435:/usr/src#

I need to go up a directory level and to use --force because I already have the project directory in place, and lein new assumes one doesn't.

And this all seems to work fine (well… in that it's done "stuff"):

root@a1be067ee435:/usr/src# tree -F -a --dirsfirst -L 1 living-clojure/
living-clojure/
├── .git/
├── doc/
├── resources/
├── src/
├── test/
├── .gitignore*
├── .hgignore*
├── CHANGELOG.md*
├── LICENSE*
├── README.md*
└── project.clj*

5 directories, 6 files
root@a1be067ee435:/usr/src#

git status shows something interesting though:

adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$ git status
On branch main
Your branch is up to date with 'origin/main'.

Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: .gitignore
modified: LICENSE
modified: README.md

Untracked files:
(use "git add <file>..." to include in what will be committed)
CHANGELOG.md
doc/
project.clj
src/
test/

no changes added to commit (use "git add" and/or "git commit -a")
adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$

Note how .gitignore, LICENSE and README.md have all changed. Such is the price of doing that --force I guess. I'm gonna revert the LICENSE and README.md, but leave the .gitignore; I presume the Leiningen bods have a better idea of what git should ignore in the context of a Clojure project than GitHub's boilerplate one does.

Next… what's the code in the default project actually do?

(ns living-clojure.core
  (:gen-class))

(defn -main
  "I don't do a whole lot ... yet."
  [& args]
  (println "Hello, World!"))

That will do for a start. Let's see if it runs:

root@a1be067ee435:/usr/src# cd living-clojure/
root@a1be067ee435:/usr/src/living-clojure# lein run
Hello, World!
root@a1be067ee435:/usr/src/living-clojure#

Cool! Is it bad I'm so pleased with a computer saying "Hello, World!"? Ha.

I'm being a bit naughty because I'm running the code before I've tested it. But I figure... not my code, so it's OK. Anyway, there is a test, although it does not test the code in the app. Sigh. But anyhow:

root@a1be067ee435:/usr/src/living-clojure# cat test/living_clojure/core_test.clj
(ns living-clojure.core-test
  (:require [clojure.test :refer :all]
            [living-clojure.core :refer :all]))

(deftest a-test
  (testing "FIXME, I fail."
    (is (= 0 1))))
root@a1be067ee435:/usr/src/living-clojure# lein test

lein test living-clojure.core-test

lein test :only living-clojure.core-test/a-test

FAIL in (a-test) (core_test.clj:7)
FIXME, I fail.
expected: (= 0 1)
actual: (not (= 0 1))

Ran 1 tests containing 1 assertions.
1 failures, 0 errors.
Tests failed.
root@a1be067ee435:/usr/src/living-clojure#

I guess this is in-keeping with "start with a failing test". Fair enough then. At least the test runs correctly, which is good.

The last bit of checking the baseline project is… will it compile and run as a jar?

root@a1be067ee435:/usr/src/living-clojure# lein uberjar
Compiling living-clojure.core
Created /usr/src/living-clojure/target/uberjar/living-clojure-0.1.0-SNAPSHOT.jar
Created /usr/src/living-clojure/target/uberjar/living-clojure-0.1.0-SNAPSHOT-standalone.jar
root@a1be067ee435:/usr/src/living-clojure# java -jar target/uberjar/living-clojure-0.1.0-SNAPSHOT-standalone.jar
Hello, World!
root@a1be067ee435:/usr/src/living-clojure#

Perfect.

Right. Now I actually need to write some code. I wanna change that main function to read the first argument from the command-line - if present - and say "G'day [value]" (eg: "G'day Zachary"), or if there's no argument, then stick with just "G'day World".

Given that's all very simple, I could just do it with a text editor, but I'll hook an IDE up to this lot too. I googled around and landed on VSCode and the Calva Clojure plug-in. I've already set all that up and can't be arsed redoing it for the purposes of this article. One issue I had… and still have… is that Calva is supposed to be able to connect to an external REPL, instead of using its own one. Whilst experimenting trying to get this working, I tweaked my docker create statement to run a headless REPL instead of bash:

docker create --name living-clojure --mount type=bind,source=/mnt/c/src/living-clojure,target=/usr/src/living-clojure --expose 56505 -p 56505:56505 --workdir /usr/src/living-clojure --interactive --tty clojure lein repl :headless :host 0.0.0.0 :port 56505

The new stuff is as follows:

  • I probably don't need the --expose here for what I'm doing, but the -p exposes the container's port 56505 to the host PC (56505 has no significance… it was just one of the ports a previous REPL had used, so I copied that).
  • I start a headless REPL (so basically a REPL as a service, that just listens for REPL connections) that listens on that port 56505
  • I have to use 0.0.0.0 here because when I tried to use 127.0.0.1, that worked fine from within the container, but was no good on the host machine. It took me a bit of googling to crack that one.

I start this one without being interactive:

adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$ docker start living-clojure
living-clojure
adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$

Doing it this way I am exposing a REPL to other sessions, whilst still being able to bash my way along too:

adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$ docker exec --interactive --tty living-clojure lein repl :connect localhost:56505
Connecting to nREPL at localhost:56505
REPL-y 0.4.4, nREPL 0.6.0
Clojure 1.10.1
OpenJDK 64-Bit Server VM 11.0.9.1+1
Docs: (doc function-name-here)
(find-doc "part-of-name-here")
Source: (source function-name-here)
Javadoc: (javadoc java-object-or-class-here)
Exit: Control+D or (exit) or (quit)
Results: Stored in vars *1, *2, *3, an exception in *e

living-clojure.core=> (println "G'day world")
G'day world
nil
living-clojure.core=> quit
Bye for now!
adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$ docker exec --interactive --tty living-clojure /bin/bash
root@222940f1eafd:/usr/src/living-clojure# ls
CHANGELOG.md LICENSE README.md doc project.clj resources src target test
root@222940f1eafd:/usr/src/living-clojure# exit
exit
adam@DESKTOP-QV1A45U:/mnt/c/src/living-clojure$

I could get VSCode on my desktop PC to see this REPL:


But I could not get it to do stuff like running tests, which it was able to do using the built-in REPL. So that sucked. I googled a bit and I am not the only person with this problem when using external REPLs, so I have just decided to give up on that. I'll cheat and use the built-in REPL.

Oh god. I now need to write some Clojure code (I've now caught up with my previous progress, and am typing this article "live" now… Bear with me for a bit).

Time passes…

More time passes and there is swearing…

OK, got there. I abandoned using VSCode and Calva because Calva seems to have this strange idea that when I go to delete things (like extra or misplaced parentheses), that it knows better and doesn't just do what it's frickin' told. It's probably me not knowing some special trick that people accustomed to using it know, but… for the love of god if I ask a frickin' text editor to do something in the edit window, I expect it to just do it even if I'm asking it to do something less than ideal. I will seek out a different plugin later. Once I dropped back to using Notepad++, things went better. I have done a lot of googling in the last two hours though. Sigh.

Anyhow, here are my tests:

(ns living-clojure.core-test
  (:require [clojure.test :refer :all]
            [living-clojure.core :refer :all]))

(deftest test-greet-default-behaviour
  (testing "it should return G'day world if no name is passed"
    (is (= "G'day World" (greet [])))))

(deftest test-greet-by-single-name
  (testing "it should return G'day [name] if just that name is passed"
    (is (= "G'day Zachary" (greet ["Zachary"])))))

(deftest test-greet-by-multiple-name
  (testing "it should return G'day [name] if more than one names are passed"
    (is (= "G'day Zachary" (greet ["Zachary" "Cameron" "Lynch"])))))

And here is the code:

(ns living-clojure.core
  (:gen-class))

(defn greet
  ([args]
   (str "G'day " (or (first args) "World"))))

(defn -main
  "I greet someone or everyone"
  [& args]
  (println (greet args)))

And here's the output of the tests, running the code, and compiling and running the code:

root@222940f1eafd:/usr/src/living-clojure# lein test

lein test living-clojure.core-test

Ran 3 tests containing 3 assertions.
0 failures, 0 errors.


root@222940f1eafd:/usr/src/living-clojure# lein run
G'day World


root@222940f1eafd:/usr/src/living-clojure# lein run Zachary
G'day Zachary


root@222940f1eafd:/usr/src/living-clojure# lein run Zachary Cameron Lynch
G'day Zachary


root@222940f1eafd:/usr/src/living-clojure# lein uberjar
Compiling living-clojure.core
Created /usr/src/living-clojure/target/uberjar/living-clojure-0.1.0-SNAPSHOT.jar
Created /usr/src/living-clojure/target/uberjar/living-clojure-0.1.0-SNAPSHOT-standalone.jar


root@222940f1eafd:/usr/src/living-clojure# java -jar target/uberjar/living-clojure-0.1.0-SNAPSHOT-standalone.jar
G'day World


root@222940f1eafd:/usr/src/living-clojure# java -jar target/uberjar/living-clojure-0.1.0-SNAPSHOT-standalone.jar Zachary
G'day Zachary


root@222940f1eafd:/usr/src/living-clojure# java -jar target/uberjar/living-clojure-0.1.0-SNAPSHOT-standalone.jar Zachary Cameron Lynch
G'day Zachary


root@222940f1eafd:/usr/src/living-clojure#

I got a bit snagged on the variadic parameters syntax (the &) for a while, largely cos to me that just says "it's a reference", so I need to unlearn that in this context. And then cos I've been staring at the screen too long now, I had a brain-disconnect over the fact that only the parameters of -main were variadic… by the time they got to greet it was just an array (or a list or whatever Clojure calls it). So due to dumbarsery on my part I variously had the tests passing but the behaviour at runtime slightly off, or the tests failing but the program actually running correctly. What this says is that I probably should have been testing -main, not greet. My bad.
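
For the record, a test of -main itself would look something like this. It's just a sketch - not something I've actually added to the project - and it'd sit in the same core-test namespace as the other tests; with-out-str captures what println writes, and passing the names as separate arguments exercises the variadic [& args] directly:

(deftest test-main-greets-by-first-name-only
  (testing "-main should print G'day [first name], however many names are passed"
    ;; -main is variadic ([& args]), so the names go in as separate arguments, not a vector;
    ;; with-out-str captures the println output, trailing newline included
    (is (= "G'day Zachary\n" (with-out-str (-main "Zachary" "Cameron" "Lynch"))))))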

OK. It's 22:30 and I've been at this on and off for quite a few hours now. I'm pretty pleased with the exercise as a whole - I learned a lot - but now I want to go shoot things. Parentheses. I want to shoot parentheses. But I guess I will settle for shooting raiders in Fall Out 4.

Righto.

--
Adam

Docker: resolving "Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?" issue


G'day:

Firstly some framing. This morning I sat down to do some more Docker stuff, and started to get this issue again:

adam@DESKTOP-QV1A45U:/mnt/c/src/fullstackExercise/docker$ docker-compose down
Traceback (most recent call last):
File "bin/docker-compose", line 3, in <module>
File "compose/cli/main.py", line 67, in main
File "compose/cli/main.py", line 123, in perform_command
File "compose/cli/command.py", line 69, in project_from_options
File "compose/cli/command.py", line 125, in get_project
File "compose/cli/command.py", line 184, in get_project_name
File "posixpath.py", line 383, in abspath
FileNotFoundError: [Errno 2] No such file or directory
[2052] Failed to execute script docker-compose
adam@DESKTOP-QV1A45U:/mnt/c/src/fullstackExercise/docker$

(I say "again", cos I have been getting this a lot whilst working on another article that's taken me a coupla days to write so far, and won't be done until tomorrow. Note: this issue is not the one I am looking at in this article… I'm looking at that "Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?" issue.)

OK, so enough context. Back to "Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?".

From a lot of googling, most suggestions start with a de-install/re-install Docker Desktop, so decided to try that first. It's important to note that a lot of people had tried this, it had worked for a while, but then it started again. I think it's addressing a symptom not the problem, but I might as well be on the most recent version of Docker Desktop anyhow.

After doing the upgrade I happened to have PhpStorm open, and I absent-mindedly clicked the "refresh" button on the config for the connection to Docker I had. It errored out saying "got no PHP container mate", which was true because the de-install of Docker Desktop got rid of the containers I had. To verify this, I hit the shell:

adam@DESKTOP-QV1A45U:~$ docker container ls
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
adam@DESKTOP-QV1A45U:~$

Erm… OK, that's new. Now, I had been messing with the Docker daemon settings in Docker Desktop trying to get PhpStorm connecting, so went back in there and re-dis-enabled(!) the "Docker daemon with no TLS support" setting. I mean this setting:

PhpStorm would stop connecting, but in the mean time I could get my containers up and running. I hit the shell again:

adam@DESKTOP-QV1A45U:~$ docker container ls
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
adam@DESKTOP-QV1A45U:~$

Bugger.

adam@DESKTOP-QV1A45U:~$ docker ps
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
adam@DESKTOP-QV1A45U:~$

[I said more than bugger this time]

Great. In troubleshooting one issue, I have caused another one. Excellent. Thanks Docker. Thanks me.

After a bunch of googling and restarting randomly and shaking my head at everyone's solution for this issue - which seemed to largely be "switch it off and switch it back on again", etc - I landed on someone commenting "weird thing is though… it's all still fine in Powershell. It's just broke in WSL". I checked:

PS C:\Users\camer> docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
PS C:\Users\camer>

Interesting. Something to do with WSL then. Then I remembered a setting I thought seemed odd in Docker Desktop when I gave it the once-over after re-install:

I did not think much of this at the time, because Ubuntu is my default WSL distro, so I figured that was fine. After all… everything had been working fine like that. On a whim, I decided to switch those two toggles back on though:

And now back to bash:

adam@DESKTOP-QV1A45U:/mnt/c/src/fullstackExercise/docker$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
adam@DESKTOP-QV1A45U:/mnt/c/src/fullstackExercise/docker$

Hurrah. So basically when I de-/re-installed Docker Desktop, it did not maintain its existing settings, and defaulted those two to be off? Weird, but I'm guessing so.
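
As an aside, a quick way to check whether the shell yer in can actually see the daemon at all is:

docker version

The Client section always prints; if the client can't reach the daemon, the Server section is where that "Cannot connect to the Docker daemon" complaint shows up instead.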

Anyway, no answer I saw via my googlings had this particular solution, so I figured I'd write it up. This might be obvious to everyone who already knows, but there's enough people out there who clearly don't already know for it to possibly be useful. Perhaps. I'm also not suggesting it's the solution for everyone getting this "Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?" issue, but it's something to check. Ah anyway it's a far easier and quicker article to write than the one I'd been working on. Quick win 'n' all. I'll have the longer article done tomorrow I hope.

Righto.

--
Adam

PHPUnit: get code coverage reporting working on PHP 8 / PHPUnit 9.5


 G'day:

I'm still beavering away getting a PHP dev environment working via Docker. As I mentioned in the previous article, I'm writing an article about how I'm doing that as I go, but I've had a few issues today that have made me pause that.

My current issue has been getting PHPUnit working properly. This has nothing to do with the Docker environment (as far as I can tell), it's just... me and PHPUnit not playing nicely together. The solution to my issue was not immediately apparent to me... it's taken me about three hours to sort this out. So I figured it might be worthwhile writing down.

I have a barebones "G'day World" website running in a coupla Docker containers (one for Nginx, one for PHP). That's all working fine, I can get "G'day world" being delivered to the browser via both HTML and PHP requests. Before I start writing any really-really code, I decided I better get PHPUnit working. TDD 'n' all that. I will admit I felt dirty yesterday when I wrote the gdayWorld.html and gdayWorld.php and ran them without tests; I just browsed to them and went "seems legit". Not the way I like to operate.

I sat down to write some tests. My gdayWorld.php is... basic:

<?php

$message = "G'day World";
echo $message;

I don't have any framework or anything installed as yet, so that's as good as it gets. That is in the web root. To test it I'd need to curl it and check the response; then I remembered Symfony has a WebTestCase for this sort of thing; then I decided I could not be arsed looking into that right now. All I need is a test to run to check PHPUnit is running. So... erm... here we go:

/** @coversNothing */
public function testNothing()
{
    $this->assertTrue(false);
}

This is enough to test PHPUnit. I'll write a better test once I know everything is working.

root@5fdd4abf5d20:/usr/share/fullstackExercise# vendor/bin/phpunit
PHPUnit 9.5.0 by Sebastian Bergmann and contributors.

F                                 1 / 1 (100%)

Time: 00:00.166, Memory: 6.00 MB

There was 1 failure:

1) adamCameron\fullStackExercise\test\functional\public\WebServerTest::testNothing
Failed asserting that false is true.

/usr/share/fullstackExercise/test/functional/public/WebServerTest.php:12

FAILURES!
Tests: 1, Assertions: 1, Failures: 1.
root@5fdd4abf5d20:/usr/share/fullstackExercise#

Perfect. Now I can make it pass ;-)

$this->assertTrue(true);
root@5fdd4abf5d20:/usr/share/fullstackExercise# vendor/bin/phpunit
PHPUnit 9.5.0 by Sebastian Bergmann and contributors.

.                                 1 / 1 (100%)

Time: 00:00.127, Memory: 6.00 MB

OK (1 test, 1 assertion)
root@5fdd4abf5d20:/usr/share/fullstackExercise#

So far, so good. Next step, whilst I'm configuring PHPUnit stuff: get the code coverage working too. I just added the code-coverage reporting bit to phpunit.xml (there's a sketch of what that bit looks like just after the output below) and re-ran the tests:

root@eb28587ca66f:/usr/share/fullstackExercise# vendor/bin/phpunit
PHPUnit 9.5.0 by Sebastian Bergmann and contributors.

Warning: No code coverage driver available

.                                 1 / 1 (100%)

Time: 00:00.210, Memory: 6.00 MB

OK (1 test, 1 assertion)
root@eb28587ca66f:/usr/share/fullstackExercise#
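
For anyone playing along at home: the code-coverage reporting bit in a PHPUnit 9.x phpunit.xml is along these lines. This is just a sketch - the directory names are placeholders for wherever your code and report live - and PHPUnit 8.5 spells the same thing with <filter>/<whitelist> and <logging> elements instead:

<coverage processUncoveredFiles="true">
    <include>
        <directory suffix=".php">src</directory>
    </include>
    <report>
        <html outputDirectory="test/coverage-report"/>
    </report>
</coverage>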

Doh! Forgot about that. I need to install Xdebug.

This was way more horsing around than it could have been, but this is down to the vagaries of the Docker set-up I was using. Once I decided to change direction it was a one-liner in me Dockerfile, and all good.

root@37d06fcf9209:/usr/share/fullstackExercise# php -v
PHP 8.0.0 (cli) (built: Dec 1 2020 03:33:03) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.0-dev, Copyright (c) Zend Technologies
with Xdebug v3.0.1, Copyright (c) 2002-2020, by Derick Rethans
root@37d06fcf9209:/usr/share/fullstackExercise#

Run the tests again...

root@37d06fcf9209:/usr/share/fullstackExercise# vendor/bin/phpunit
PHPUnit 9.5.0 by Sebastian Bergmann and contributors.

Warning: XDEBUG_MODE=coverage or xdebug.mode=coverage has to be set

.                                 1 / 1 (100%)

Time: 00:00.220, Memory: 6.00 MB

OK (1 test, 1 assertion)
root@37d06fcf9209:/usr/share/fullstackExercise#

OK. Fine. Makes sense I s'pose. I added the ini setting to my phpunit.xml:

<php>
    <ini name="xdebug.mode" value="coverage" />
</php>

No difference. I figured the ini setting is probably changed too late in the piece, as xdebug.mode will need to be set before PHP starts running any code. And perhaps the environment variable approach would work, if the actual environment variable was set beforehand. As it happens it ain't - all that happens is the value is added to the $_ENV array, which is hardly the same thing. From the docs:



However this step is still a relevant one cos we see a behavioural change here:

root@37d06fcf9209:/usr/share/fullstackExercise# vendor/bin/phpunit
PHPUnit 9.5.0 by Sebastian Bergmann and contributors.
Segmentation fault
root@37d06fcf9209:/usr/share/fullstackExercise#

Splat!

I'm gonna spare you a coupla hours of looking in to how to troubleshoot segmentation faults here. Nothing I read helped, and there's nothing useful relating to getting them with PHPUnit, other than a bunch of people getting these segmentation faults.

I started to wonder if it was an issue with Xdebug and PHP8, given PHP8 is so new and sometimes in the past I've noted that Xdebug has needed updating when there's a point release of PHP. But looking into this (at Xdebug: Supported Versions and Compatibility), the current version of Xdebug has been around for a while, and they say it's good to go with PHP8:

I was now thinking about versions of stuff, so I decided to roll back PHP to 7.4, and PHPUnit to 8.5. I realise I should only have changed one at a time - when troubleshooting things, only change one thing per test iteration - but that did not occur to me whilst I was doing it.

root@41f51888adff:/usr/share/fullstackExercise# php -v
PHP 7.4.13 (cli) (built: Dec 1 2020 04:25:48) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
with Xdebug v3.0.1, Copyright (c) 2002-2020, by Derick Rethans
root@41f51888adff:/usr/share/fullstackExercise# vendor/bin/phpunit --version
PHPUnit 8.5.13 by Sebastian Bergmann and contributors.

root@41f51888adff:/usr/share/fullstackExercise# vendor/bin/phpunit
PHPUnit 8.5.13 by Sebastian Bergmann and contributors.

.Code coverage needs to be enabled in php.ini by setting 'xdebug.mode' to 'coverage'
root@41f51888adff:/usr/share/fullstackExercise#

This verifies we are back to PHP 7.4 and PHPUnit 8.5. And we don't have the segmentation fault any more, but... now PHP is ignoring both of the settings in phpunit.xml. I have this in place:

<php>
    <env name="XDEBUG_MODE" value="coverage"/>
    <ini name="xdebug.mode" value="coverage"/>
</php>

At this point I read up on the <ini> and <env> settings in phpunit.xml, and confirmed both of them are set too late for Xdebug to read them anyhow. Quite odd the way PHPUnit 9.5 reacts to the XDEBUG_MODE being set in that case. Anyway, I need to set them before PHP is run, so I set a really-really environment variable in my Dockerfile:

ENV XDEBUG_MODE=coverage
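
An alternative that doesn't need the image rebuilt is to set it just for the one command when running the tests, eg:

XDEBUG_MODE=coverage vendor/bin/phpunit

The Xdebug docs list the XDEBUG_MODE environment variable as taking precedence over the xdebug.mode ini setting. I want it baked into the container though, so Dockerfile it is.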

After rebuilding my containers:

root@6053a26dc3eb:/usr/share/fullstackExercise# echo $XDEBUG_MODE
coverage
root@6053a26dc3eb:/usr/share/fullstackExercise# vendor/bin/phpunit
PHPUnit 8.5.13 by Sebastian Bergmann and contributors.

..                                 2 / 2 (100%)

Time: 1.01 seconds, Memory: 6.00 MB

OK (2 tests, 2 assertions)

Generating code coverage report in HTML format ... done [596 ms]
root@6053a26dc3eb:/usr/share/fullstackExercise#

Hurrah! Let's have a look at the report:

This is correct: I'm not testing that code in /public yet.

I also quickly tried to use the xdebug.mode ini file setting. I dropped a /usr/local/etc/php/conf.d/phpunit-code-coverage-xdebug.ini with just xdebug.mode=coverage in it into my container's file system:

root@f3cd4a0715a5:/usr/share/fullstackExercise# cat /usr/local/etc/php/conf.d/phpunit-code-coverage-xdebug.ini
xdebug.mode=coverage
root@f3cd4a0715a5:/usr/share/fullstackExercise# php --ini
Configuration File (php.ini) Path: /usr/local/etc/php
Loaded Configuration File: (none)
Scan for additional .ini files in: /usr/local/etc/php/conf.d
Additional .ini files parsed: /usr/local/etc/php/conf.d/docker-php-ext-pdo_mysql.ini,
/usr/local/etc/php/conf.d/docker-php-ext-sodium.ini,
/usr/local/etc/php/conf.d/docker-php-ext-xdebug.ini,
/usr/local/etc/php/conf.d/phpunit-code-coverage-xdebug.ini

root@f3cd4a0715a5:/usr/share/fullstackExercise# php -i | grep xdebug.mode
xdebug.mode => coverage => coverage
root@f3cd4a0715a5:/usr/share/fullstackExercise# vendor/bin/phpunit
PHPUnit 8.5.13 by Sebastian Bergmann and contributors.

..                                 2 / 2 (100%)

Time: 1.07 seconds, Memory: 6.00 MB

OK (2 tests, 2 assertions)

Generating code coverage report in HTML format ... done [869 ms]
root@f3cd4a0715a5:/usr/share/fullstackExercise#

Perfect.

My supposition here is that the issue all along was that I needed to set one of either the ini file option, or the environment variable option before PHPUnit runs. PHPUnit 8 correctly sez "OI!" if I don't; whereas PHPUnit 9 doesn't check correctly, and instead messes things up. Let's roll forward to PHP8 and PHPUnit 9.5 again and see:

root@f7d29f40ad62:/usr/share/fullstackExercise# php -v
PHP 8.0.0 (cli) (built: Dec 1 2020 03:33:03) ( NTS )
Copyright (c) The PHP Group
Zend Engine v4.0.0-dev, Copyright (c) Zend Technologies
with Xdebug v3.0.1, Copyright (c) 2002-2020, by Derick Rethans
root@f7d29f40ad62:/usr/share/fullstackExercise# vendor/bin/phpunit --version
PHPUnit 9.5.0 by Sebastian Bergmann and contributors.

root@f7d29f40ad62:/usr/share/fullstackExercise# vendor/bin/phpunit
PHPUnit 9.5.0 by Sebastian Bergmann and contributors.
..                                 2 / 2 (100%)

Time: 00:01.535, Memory: 10.00 MB

OK (2 tests, 2 assertions)

Generating code coverage report in HTML format ... done [00:00.605]
root@f7d29f40ad62:/usr/share/fullstackExercise#

Hoo-frickin-ray!!

So that supposition was correct then.

I guess I could have saved myself a bunch of time here by just setting the .ini setting or the environment variable in the correct environment right from the outset. Then again had PHPUnit 9 not changed its warnings, I would have known that straight away: PHPUnit 8 was pretty clear on this. The problem and the solution here were pretty mundane, but I learned a lot about how to do things with Docker (which will all go into the other article I'm writing at the same time as this one), some stuff about Xdebug, some stuff about PHP and some stuff about PHPUnit. So it was a worthwhile afternoon of scratching my head. I'm also pleased I've sat here and done 8hrs of "work" for the second day running. Hopefully this will stick and I'll work my way out of my unemployed malaise before long. We'll see.

PHPUnit 9.5 on PHP 7.4

As I said earlier, I should not have changed two things at once when testing this, as it makes it harder to draw accurate conclusions. I omitted to say I did then go back and test PHPUnit 9.5 on PHP 7.4, so as to check whether the issue was due to PHPUnit 9.5, or whether it was some combination of that and PHP version. Here are the test results when running PHPUnit 9.5 on PHP 7.4 with XDEBUG_MODE set in phpunit.xml and not in the environment. xdebug.mode is not set anywhere:

root@81c63ae50626:/usr/share/fullstackExercise# php -v
PHP 7.4.13 (cli) (built: Dec 1 2020 04:25:48) ( NTS )
Copyright (c) The PHP Group
Zend Engine v3.4.0, Copyright (c) Zend Technologies
with Xdebug v3.0.1, Copyright (c) 2002-2020, by Derick Rethans
root@81c63ae50626:/usr/share/fullstackExercise# vendor/bin/phpunit --version
PHPUnit 9.5.0 by Sebastian Bergmann and contributors.
root@81c63ae50626:/usr/share/fullstackExercise# vendor/bin/phpunit
PHPUnit 9.5.0 by Sebastian Bergmann and contributors.
Segmentation fault
root@81c63ae50626:/usr/share/fullstackExercise#

To me this demonstrates the issue is with PHPUnit 9.5 and nothing else.

Anyway... now for a beer.

Righto.


--
Adam
